LOCAL CENTRAL LIMIT THEOREMS, THE HIGH-ORDER
CORRELATIONS OF REJECTIVE SAMPLING AND LOGISTIC
LIKELIHOOD ASYMPTOTICS

By Richard Arratia, Larry Goldstein and Bryan Langholz

University of Southern California 

Let $I_1, \ldots, I_n$ be independent but not necessarily identically distributed Bernoulli random variables, and let $X_n = \sum_{j=1}^{n} I_j$. For $\nu$ in a bounded region, a local central limit theorem expansion of $P(X_n = \mathbb{E}X_n + \nu)$ is developed to any given degree. By conditioning, this expansion provides information on the high-order correlation structure of dependent, weighted sampling schemes of a population $E$ (a special case of which is simple random sampling), where a set $d \subset E$ is sampled with probability proportional to $\prod_{A \in d} x_A$, where $x_A$ are positive weights associated with individuals $A \in E$. These results are used to determine the asymptotic information, and demonstrate the consistency and asymptotic normality of the conditional and unconditional logistic likelihood estimator for unmatched case-control study designs in which sets of controls of the same size are sampled with equal probability.

1. Introduction. The unmatched case-control study is one of the most 
widely used designs in chronic disease epidemiologic research. Typically, a 
large number of individuals, the cohort or study base, will be observed for 
occurrence of a binary disease outcome. Because the number of subjects is 
large and only a small proportion will be cases that contract the disease of 
interest, nondiseased controls are sampled to serve as a comparison group. 
Exposure and other covariate information is then obtained for the case- 
control study subjects for use in statistical analyses. As an example, in a 
study to assess the association of a variety of antihypertensive drugs and the risk
of myocardial infarction (MI), 623 MI cases who used antihypertensive drugs
were identified within an HMO in Washington State. The cases were grouped
by sex, 10-year age group, and calendar year of MI [Psaty et al. (1995)]. For each
group, a number of controls from the antihypertensive drug users were sampled
in a fixed proportion to the number of cases. For each case-control study
member, the types of antihypertensive drugs used were ascertained through 
computerized records, chart review and interview. The primary method of 
analysis was unconditional logistic regression. It was found that risk of MI 
was 60% higher among calcium channel blocker users compared to users of either
diuretics alone or β-blockers, a finding that has resulted in a
change in treatment strategy.

The structure of these data is prospective in that disease occurrence is 
conditional on the covariate information, and controls are randomly sampled 
from the pool of nondiseased. This is the structure of a nested case-control 
study from the study base [Mantel (1973)], which we call the nested case- 
control data model. Another way to view case-control data is retrospectively 
in which the case and control covariate values are taken to be independent 
realizations from their respective distributions [e.g., Breslow and Powers 
(1978), Prentice and Pyke (1979), Weinberg and Wacholder (1993) and Carroll, Wang and Wang 
(1995)]. Although the nested case-control model is used in modern texts on 
case-control studies in epidemiologic research [e.g., Breslow and Day (1980), 
Kelsey, Whittemore, Evans and Thompson (1996) and Rothman and Greenland 
(1998)], it has been the retrospective model that is invoked when developing 
estimators and analyzing their properties. However, the assumption that the 
case and control covariates are independent random replicates may not hold 
in practice. For instance, if the distribution of drug types changed during the 
antihypertensive drug-MI study, differences in treatment within the case and 
control populations would make the modeling of the covariates by a common 
distribution within each group untenable, so the conditions required by the 
retrospective model analysis would not be met. But, it seems evident that 
valid results can still be drawn from such a study since the assignment of 
drug type to subjects should not influence the association between the drug 
type and disease. 

In this paper we develop the theory necessary to determine the asymp- 
totic behavior of estimators of the odds ratio in the nested case-control 
model under general conditions on the covariates and sampling methods. 
We then apply this theory to the maximum conditional and unconditional 
logistic likelihood estimators. Although the conditional logistic likelihood 
gives rise to valid estimators in a wider range of case-control study settings 
than the unconditional (e.g., individually matched case-control designs), its 
asymptotic properties for "large strata" have not been studied. The path of 
our analysis leads us through some unexpectedly broad territory, including a 
high-order local central limit theorem for the Poisson-Binomial distribution 
and expansions for the inclusion probabilities and correlation structure of 
rejective sampling. 

After formally introducing the problem of analysis of case-control data in Section 1.1, in Section 2 we prove Theorem 2.1, a high-order local central limit theorem for the sum $X_n$ of independent but not necessarily identically distributed Bernoulli random variables having success probability $p_j$, $j = 1, 2, \ldots$. This result gives an expansion to any desired order for the probability that the sum $X_n$ deviates from its mean $\mathbb{E}X_n$ by the value $\nu$, uniformly for $\nu$ in any bounded region. This result is of independent interest, as it provides a means to approximate, with rates, the Poisson-Binomial distribution, for which no simple expression exists.

In Section 3 we extend Theorem 2.1 by showing that this local central limit theorem expansion holds for the sums $X_E$ of independent Bernoulli variables with success probability

$p_{A,\lambda} = \frac{\lambda x_A}{1 + \lambda x_A}, \quad A \in E,$

uniformly for all $\lambda$ in an interval bounded away from zero and infinity, under asymptotic stability conditions on the weights $x_A$, $A \in E$. For any $\lambda > 0$, conditioning the Bernoulli variables on the event $X_E = \eta$ gives Hajek's rejective sampling scheme $P_{E,\eta}$ on $E$, where a set $d \subset E$ of size $\eta$ is sampled with probability proportional to $x_d$, the product of the weights $x_A$ over $A \in d$. Choosing $\lambda$ so that the expected number of successes $\mathbb{E}X_E$ equals $\eta$ allows for the application of the local central limit Theorem 2.1, yielding Theorem 3.1, which gives an expansion for the inclusion probabilities under the rejective sampling scheme $P_{E,\eta}$. This expansion is applied in Section 4 to derive Theorem 4.1, yielding the high-order correlation structure of rejective sampling.

In Section 5 we apply the rejective sampling results to the asymptotics of 
estimators under the nested case-control model. Theorems 5.1 and 5.2 give 
the asymptotic information and demonstrate the consistency and asymptotic 
normality of the conditional and unconditional logistic maximum likelihood 
estimators, respectively. 

Finally, in Section 6, we compare our approach to others, and, in partic- 
ular, to the derivation of asymptotics by Prentice and Pyke (1979) under 
the retrospective model. Lastly, we discuss efficiency issues, extensions and 
directions for further research. 

1.1. The statistical model and likelihood. The prospective logistic model for disease occurrence is as follows: with covariate vector $z \in \mathbb{R}^p$, the probability of disease is

(1)  $p_\lambda(z;\beta) = \frac{\lambda x(z;\beta)}{1 + \lambda x(z;\beta)},$

where $x(z, 0) = x(0, \beta) = 1$, for all $z \in \mathbb{R}^p$ and $\beta$ in the parameter space $B \subset \mathbb{R}^p$ [e.g., Breslow and Day (1980) and Cox and Snell (1989)]. The parameter $\lambda > 0$ is therefore the baseline odds and $x(z,\beta)$ is the odds ratio associated with $z$. The odds ratio parameter $\beta$ is typically of primary interest.

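We consider a "study base" $\mathcal{R} = \{1, \ldots, N\}$ of $N$ individuals with covariates $z_j$, $j \in \mathcal{R}$, and independent failure indicators $I_j$ having marginal distribution given by (1) for some $(\lambda_0, \beta_0)$, that is, $P_{\lambda_0,\beta_0}(I_j = 1) = p_{\lambda_0}(z_j; \beta_0)$. Define $x_j(\beta) = x(z_j; \beta)$, $p_{j,\lambda}(\beta) = 1 - q_{j,\lambda}(\beta) = p_\lambda(z_j; \beta)$; we may further suppress $\beta_0$ and write, for example, $x_j = x_j(\beta_0)$ and $p_{j,\lambda} = p_{j,\lambda}(\beta_0)$. Denoting the set of indices of diseased subjects by $D$, for $d \subset \mathcal{R}$, the probability of observing $D = d$ is therefore

$P_{\lambda_0,\beta_0}(D = d) = \prod_{j \in d} p_{j,\lambda_0} \prod_{j \in \mathcal{R} \setminus d} q_{j,\lambda_0} = \lambda_0^{|d|} x_d\, q_{\mathcal{R}},$

where for any $F \subset \mathcal{R}$,

(2)  $q_F = q_F(\lambda_0, \beta_0) = \prod_{j \in F} (1 + \lambda_0 x_j)^{-1}$ and $x_d = \prod_{j \in d} x_j.$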

When covariate values for all study base subjects are available, estimation of the unknown $\beta_0$ (and $\lambda_0$) can be achieved by maximizing the likelihood $P_{\lambda,\beta}(D)$. But when the study base is large or the collection of the full set of covariate values is expensive or impractical, it is natural to sample subjects to form a sampled study base $E \subset \mathcal{R}$ and use the collected covariates in the sample for the estimation of parameters. Generally, a sampling design is specified by $\pi(s|d)$, the probability of choosing $s$ as the sampled risk set $E$ when $d$ is the observed set of diseased subjects.

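For the calculation of a likelihood, additional information that $D \in \mathcal{S}$ for some $\mathcal{S}$ may be included. Conditioning on $E$ and $\mathcal{S}$ leads to the probability

(3)  $P_{\lambda_0,\beta_0}(D \mid E, \mathcal{S}) = \frac{\lambda_0^{|D|} x_D(\beta_0)\, \pi(E|D)}{\sum_{u \in \mathcal{S}} \lambda_0^{|u|} x_u(\beta_0)\, \pi(E|u)}.$

A likelihood is formed by allowing the parameters in (3) to vary to obtain the likelihood function $L_{E,\mathcal{S}}(\lambda,\beta) = P_{\lambda,\beta}(D \mid E, \mathcal{S})$.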

Of particular interest for epidemiologic unmatched case-control studies is the likelihood which results from (3) when conditioning on the number of cases in the case-control set. In practice, in unmatched case-control studies one typically has information on all cases and a set of controls obtained using sampling schemes such as frequency matching, fixed size sampling, Bernoulli trials and case-base sampling [e.g., Kupper, McMichael and Spirtas (1975), Breslow and Day (1980), Wacholder, Silverman, McLaughlin and Mandel (1992) and Langholz and Goldstein (2001)]. For each of these designs, the probability $\pi(s|d)$ is zero unless $s$ contains $d$, and is otherwise constant in $|d|$. Then, setting $\mathcal{S} = \{u \subset E : |u| = \eta\}$, where $\eta = |D|$, $\lambda_0$ and the sampling probabilities $\pi(s|d)$ cancel from (3), and noting the dependence of the resulting probability on $E$ and $\eta$ only, we define

(4)  $P_{E,\eta}(D) = P_\beta(D \mid E, \eta) = \frac{x_D}{\sum_{u \subset E:\, |u| = \eta} x_u}.$

This is the basis for the "standard" conditional logistic likelihood $L_{E,\eta}(\beta) = P_\beta(D \mid E, \eta)$ [e.g., Cox (1972) and Cox and Snell (1989)] for the designs mentioned above, which has log likelihood

$\mathcal{L}_{E,\eta}(\beta) = \sum_{A \in D} \log x_A(\beta) - \log\Big\{ \sum_{u \subset E:\, |u| = \eta} x_u(\beta) \Big\}.$

The conditional logistic likelihood estimator is a value maximizing $\mathcal{L}_{E,\eta}(\beta)$.
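
The denominator of (4) is a sum over all subsets of size $\eta$, which is the elementary symmetric polynomial of degree $\eta$ in the weights, so it need not be enumerated. The following Python sketch illustrates this under stated assumptions: a scalar covariate, the exponential model $x_A(\beta) = \exp(\beta z_A)$, and hypothetical function names chosen here for illustration only; it is not the authors' implementation.

```python
import math
from itertools import combinations


def elementary_symmetric(weights, eta):
    """Sum over all subsets u of size eta of the product of weights in u."""
    e = [1.0] + [0.0] * eta
    for x in weights:
        # Update high degrees first so each weight enters a product at most once.
        for k in range(eta, 0, -1):
            e[k] += x * e[k - 1]
    return e[eta]


def brute_force_denominator(weights, eta):
    """Same quantity by explicit enumeration (only feasible for small sets)."""
    return sum(
        math.prod(weights[a] for a in u)
        for u in combinations(range(len(weights)), eta)
    )


def conditional_log_likelihood(beta, covariates, cases):
    """Log of (4), assuming x_A(beta) = exp(beta * z_A) with scalar z_A."""
    weights = [math.exp(beta * z) for z in covariates]
    eta = len(cases)
    numerator = sum(math.log(weights[a]) for a in cases)
    return numerator - math.log(elementary_symmetric(weights, eta))


z = [0.1, -0.4, 1.2, 0.7, -1.0, 0.3]   # illustrative covariates
cases = [0, 2, 3]                      # illustrative case set D
print(conditional_log_likelihood(0.5, z, cases))
```

At $\beta = 0$ all weights equal one, so the value reduces to minus the logarithm of the number of subsets of size $\eta$, which gives a quick sanity check.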

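Differentiation of an array $F_\beta = \{F_{i_1,\ldots,i_a}(\beta)\} \in \mathbb{R}^{m_1 \times \cdots \times m_a}$ with respect to $\beta$ will be denoted by $'$, resulting in the array $F'_\beta = \{F_{j,i_1,\ldots,i_a}(\beta)\} = \{(\partial/\partial\beta_j) F_{i_1,\ldots,i_a}(\beta)\} \in \mathbb{R}^{p \times m_1 \times \cdots \times m_a}$. For $U \in \mathbb{R}^{n_1 \times \cdots \times n_a}$ and $V \in \mathbb{R}^{m_1 \times \cdots \times m_b}$, the tensor product $U \otimes V \in \mathbb{R}^{n_1 \times \cdots \times n_a \times m_1 \times \cdots \times m_b}$ has components $U_{i_1,\ldots,i_a} \cdot V_{j_1,\ldots,j_b}$, and we set $|U| = \sum_{i_1,\ldots,i_a} |U_{i_1,\ldots,i_a}|$ as the norm.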

Condition 1.1. The real valued function $x$ is positive, three times differentiable in $\beta$ and $0 < \inf_{|z| \le c} x(\beta_0, z) \le \sup_{|z| \le c} x(\beta_0, z) < \infty$ for all $c > 0$.

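Under Condition 1.1, following Barlow and Prentice (1988), define the "effective covariates" $Z_j$ by

$Z_j(\beta) = \frac{\partial}{\partial\beta} \log x_j(\beta) = \frac{x_j'(\beta)}{x_j(\beta)};$

in the model where $x_j(\beta) = \exp(\beta^{\top} z_j)$, we have $Z_j = z_j$. Now for $u \subset E$, define the inclusion probabilities

$p_u(\beta) = P_\beta(u \subset D \mid E, \eta) = \sum_{s \supset u} P_\beta(s \mid E, \eta),$

and the inclusion probability for an individual $A$ as $p_A(\beta) = p_{\{A\}}(\beta)$. With $I_A$ the failure indicator for $A \in E$ and suppressing the dependence of $Z_A$ and $p_A$ on $\beta$, the score $U_{E,\eta}(\beta) = \partial\mathcal{L}_{E,\eta}(\beta)/\partial\beta$ equals

$U_{E,\eta}(\beta) = \sum_{A \in E} Z_A (I_A - p_A) = \sum_{A \in D} Z_A - \mathbb{E}_\beta\Big( \sum_{A \in D} Z_A \,\Big|\, E, \eta \Big),$

where $\mathbb{E}_\beta$ is the expectation under $P_{E,\eta}(\beta)$. Using that for a function $F_\beta(D)$, with $W(\beta) = \partial \log P_\beta(D \mid E, \eta)/\partial\beta$, we have

(5)  $\frac{\partial}{\partial\beta} \mathbb{E}_\beta(F_\beta(D) \mid E, \eta) = \mathbb{E}_\beta(F_\beta(D)' \mid E, \eta) + \mathbb{E}_\beta(W(\beta) \otimes F_\beta(D) \mid E, \eta),$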
with $p_A + q_A = 1$, the information $-\partial U_{E,\eta}(\beta)/\partial\beta$ is given by

(6)  $I_{E,\eta}(\beta) = \sum_{A \in E} Z_A^{\otimes 2}\, p_A q_A + \sum_{A, B \in E,\, A \ne B} Z_A \otimes Z_B\, (p_{AB} - p_A p_B)$

(7)  $\qquad\qquad - \Big( \sum_{A \in D} Z_A' - \sum_{A \in E} Z_A' p_A \Big).$

Note that (6) contains $p_{AB} - p_A p_B$, the correlation of the joint inclusion of $A$ and $B$.

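In general, we have

$\mathbb{E}_\beta(U_{E,\eta}(\beta) \mid E, \eta) = 0,$

since the score is the difference between a quantity and its conditional expectation. For this same reason, when taking expectation in (7), we find that

$\mathbb{E}_\beta\Big( \sum_{A \in D} Z_A' - \sum_{A \in E} Z_A' p_A \,\Big|\, E, \eta \Big) = 0.$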

The standard likelihood argument to show the consistency of $\hat\beta$ requires that the information $|E|^{-1} I(\beta_0)$ converge in probability. Since the information is a double sum over $A$ and $B$, the inclusion correlations $p_{AB} - p_A p_B$ need to decay at rate $|E|^{-1}$. Further, the remainder term in the Taylor expansion of the log likelihood, which is required to stay bounded in probability, contains a triple sum of terms multiplied by the third-order correlation,

$\mathbb{E}_\beta[(I_A - p_A)(I_B - p_B)(I_C - p_C) \mid E, \eta];$

hence, to satisfy the boundedness condition, such triple correlations need to decay as $|E|^{-2}$. The dependence in $P_{E,\eta}$ created by having the probability of a set proportional to the product of its individual weights has been explored only under very restrictive situations [Harkness (1965) and Farewell (1979); see also Hajek (1964)]. Theorem 4.1 gives information on the rate of decay on all correlation orders, and, in particular, provides that the third-order correlation decays at the required rate. This result allows for the full treatment of the asymptotic theory for the conditional logistic maximum likelihood estimator for a large class of case-control sampling designs (Section 5).

More commonly used in practice, and making use of the same case-control subject data, is the estimator of $\beta_0$ based on maximizing the "unconditional logistic likelihood" which, with $p_{A,\lambda}(\beta)$ as in (1) and $q_E(\lambda,\beta)$ as in (2), is given by

(8)  $L_E(\lambda,\beta) = \prod_{A \in E} p_{A,\lambda}(\beta)^{I_A}\, q_{A,\lambda}(\beta)^{1 - I_A} = \lambda^{|D|} x_D(\beta)\, q_E(\lambda,\beta).$

The unconditional logistic likelihood estimator is a value maximizing $L_E$. Note that, in general, $L_E$ is not a true likelihood when data are collected using sampling methods such as frequency matching, since the contributions from individual subjects are not independent. The asymptotic analysis of the unconditional logistic estimator is carried out in Section 5.

1.2. The probabilistic setup. For any set $E$ and $0 \le \eta \le |E|$, consider the probability measure $P_{E,\eta}(d)$ given by (4), supported on the size $\eta$ subsets of $E$. With $I_A = \mathbf{1}(A \in D)$, the indicator that $A$ is included in $D$, $p_A = \mathbb{E}_{E,\eta}(I_A)$, and $H \subset E$, we study high-order correlations of the form

(9)  $\mathrm{Corr}(H) = \mathbb{E}_{E,\eta}\Big( \prod_{A \in H} (I_A - p_A) \Big).$

When $H = \{A, B\}$, a set of size 2, $\mathrm{Corr}(H) = p_{AB} - p_A p_B$, the covariance between the Bernoulli variables $I_A$ and $I_B$.

When $x_A = 1$ for all $A \in E$ [corresponding to $\beta_0 = 0$ in (1)], $P_{E,\eta}$ reduces to simple random sampling. In this case, when there exists $\tau \in (0, 1/2]$ such that the sampling fraction $\eta/|E| \in [\tau, 1 - \tau]$, then as $|E| \to \infty$,

(10)  $\mathbb{E}_{E,\eta}[(I_A - p_A)(I_B - p_B)] = -\frac{(\eta/|E|)(1 - \eta/|E|)}{|E| - 1} = \Theta_\tau(|E|^{-1})$

and

(11)  $\mathbb{E}_{E,\eta}[(I_A - p_A)(I_B - p_B)(I_C - p_C)] = O_\tau(|E|^{-2}).$

Hence, simple random sampling has the rates needed for the stability of the information and the control on the remainder in our likelihood analysis; the exact meaning of $O_\tau$ is given in Definition 2.1.

For simple random sampling a straightforward calculation shows that

$\mathrm{Corr}(H) = \sum_{j=0}^{|H|} \binom{|H|}{j} \Big( -\frac{\eta}{|E|} \Big)^{|H| - j} \prod_{i=0}^{j-1} \frac{\eta - i}{|E| - i}.$

Since here the weights $x_A$ are equal, we may write $\mathrm{Corr}(k)$ for the common value of $\mathrm{Corr}(H)$ for all $H$ of size $k$, and have verified for $k \le 10$, as $|E|, \eta \to \infty$, with $\eta/|E| \to f \in (0, 1)$, for $\mathcal{N}$ a standard normal variate,

$\lim_{|E| \to \infty} |E|^{k/2}\, \mathrm{Corr}(k) = \mathbb{E}\mathcal{N}^k\, (f(f - 1))^{k/2}$ for $k$ even

and

$\lim_{|E| \to \infty} |E|^{(k+1)/2}\, \mathrm{Corr}(k) = \tfrac{1}{3}(k - 1)\, \mathbb{E}\mathcal{N}^{k+1}\, (f(f - 1))^{(k-1)/2} (2f - 1)$ for $k$ odd.

In particular, for simple random sampling we have

(12)  $\mathrm{Corr}(H) = O_{|H|,\tau}\big( |E|^{-(|H| + |H| \bmod 2)/2} \big),$

with (10) and (11) as special cases. Theorem 4.1 shows that the orders in (12) are obtained quite generally for the weighted sampling scheme $P_{E,\eta}$.
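
The simple random sampling correlations can be computed exactly, which makes the decay rates above easy to inspect. The sketch below is an illustration only (the function name `srs_corr` is hypothetical): it expands the product of centered indicators and uses the fact that $j$ given distinct units are jointly sampled with probability $\prod_{i<j} (\eta - i)/(|E| - i)$.

```python
from fractions import Fraction
from math import comb


def srs_corr(N, eta, k):
    """Exact E prod_{A in H} (I_A - p) for |H| = k under simple random
    sampling of eta units out of N, where p = eta / N."""
    p = Fraction(eta, N)
    total = Fraction(0)
    for j in range(k + 1):
        # Probability that j given distinct units are all in the sample.
        joint = Fraction(1)
        for i in range(j):
            joint *= Fraction(eta - i, N - i)
        total += comb(k, j) * (-p) ** (k - j) * joint
    return total


N, eta = 20000, 6000          # sampling fraction f = 0.3
f = eta / N
print(float(N * srs_corr(N, eta, 2)), f * (f - 1))                      # k = 2
print(float(N**2 * srs_corr(N, eta, 3)), 2 * f * (1 - f) * (1 - 2 * f))  # k = 3
print(float(N**2 * srs_corr(N, eta, 4)), 3 * (f * (f - 1)) ** 2)         # k = 4
```

Each printed pair shows the scaled exact correlation next to the corresponding limiting value; exact rational arithmetic avoids the cancellation that floating point would suffer here.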

1.3. Rejective sampling. The scheme corresponding to the probability measure $P_{E,\eta}$ is known as rejective sampling [Hajek (1964)], and as seen in Section 1.2 includes simple random sampling as a particular case. Though simple random sampling is the most ubiquitous of all statistical methods, in some cases it is not possible to take a simple random sample. For example, the inclusion of the population member $A$ might be influenced by a certain nonnegative "size" $x_A$ associated with the item $A$, where the larger the size of an item, the easier it is to locate and the higher the probability of its inclusion.

The term rejective sampling arises since $P_{E,\eta}$ may be achieved by sampling $\eta$ individuals independently with replacement and rejecting those samples in which the $\eta$ individuals are not distinct. Hajek (1964) considers the inclusion probabilities, second-order correlations and asymptotic normality of sums obtained by rejective sampling.

Schemes where objects are sequentially sampled proportional to their size have been extensively studied [e.g., Rosen (1972) and Gordon (1983)]. However, rejective sampling differs from sampling sequentially proportional to size when $\eta \ge 2$, as can be seen by comparison of the general probability that a sample of size $\eta = 2$ results in the units $A$ and $B$. Both schemes nevertheless reduce to simple random sampling when the weights are constant.
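
The comparison for samples of size two can be made concrete. The sketch below is illustrative only, with hypothetical function names; it assumes that the sequential scheme draws each unit with probability proportional to its size among the units not yet drawn, and it computes the rejective probabilities directly from the definition that a set has probability proportional to the product of its weights.

```python
from fractions import Fraction
from itertools import combinations


def rejective_probs(weights, eta):
    """Probability of each size-eta subset, proportional to its weight product."""
    raw = {}
    for d in combinations(range(len(weights)), eta):
        prod = Fraction(1)
        for a in d:
            prod *= weights[a]
        raw[d] = prod
    total = sum(raw.values())
    return {d: w / total for d, w in raw.items()}


def sequential_pps_pair_probs(weights):
    """Probability of each unordered pair when two units are drawn one at a
    time without replacement, each draw proportional to size (an assumption
    about the sequential scheme made for this illustration)."""
    s = sum(weights)
    probs = {}
    for a, b in combinations(range(len(weights)), 2):
        xa, xb = weights[a], weights[b]
        # Either a is drawn first and then b, or b first and then a.
        probs[(a, b)] = xa / s * xb / (s - xa) + xb / s * xa / (s - xb)
    return probs


sizes = [Fraction(1), Fraction(2), Fraction(3)]
print(rejective_probs(sizes, 2))
print(sequential_pps_pair_probs(sizes))
```

With unequal sizes the two dictionaries differ, while with constant sizes both assign the same probability to every pair.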

2. A high-order local central limit theorem. The main result of this section is Theorem 2.1, a local central limit theorem expansion for the distribution of $X_n$, the sum of independent but not necessarily identically distributed indicator random variables. The first step, Lemma 2.1, is to obtain an expression for the characteristic function of the centered sum, $X_n - \mathbb{E}X_n$. In the following, we write $\theta$ for a complex number, not necessarily the same at each occurrence, such that $|\theta| \le 1$.

Lemma 2.1. Let

$X_n = \sum_{j=1}^{n} I_j,$

where $I_j$, $j = 1, \ldots, n$, are independent Bernoulli variables with $\mathbb{E}I_j = p_j = 1 - q_j$; let

(13)  $v_n^2 = \sum_{j=1}^{n} p_j q_j$ and $w_n = \sum_{j=1}^{n} p_j q_j (p_j - q_j).$

Then, denoting the characteristic function of $X_n - \mathbb{E}X_n$ by $\phi_n(t)$, for all $n = 1, 2, \ldots$ and $|t| \le 1$,

(14)  $\phi_n(t) = \exp\Big( -\frac{t^2 v_n^2}{2} + i\,\frac{t^3 w_n}{6} + \theta\,\frac{n t^4}{10} \Big).$

Furthermore, for all $t \in [-\pi, \pi]$,

(15)  $|\phi_n(t)| \le \exp(-t^2 v_n^2 / 6).$

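Proof. The characteristic function of an indicator $I$ which has been centered by subtraction of its mean $p$ is

$\mathbb{E}e^{it(I - p)} = p e^{itq} + q e^{-itp}.$

We have for all $t$,

$q e^{-itp} = q\Big( 1 - itp - \frac{t^2 p^2}{2} + i\,\frac{t^3 p^3}{6} + \theta\,\frac{t^4 p^4}{24} \Big),$

and adding the analogous expansion for $p e^{itq}$, we obtain

$\mathbb{E}e^{it(I - p)} = 1 - \frac{t^2}{2} pq + i\,\frac{t^3}{6} pq(p - q) + \theta\,\frac{t^4}{24}.$

Using that $pq(p - q) \le \sqrt{3}/18 < 1/9$, we have for $|t| \le 1$,

$\Big| -\frac{t^2}{2} pq + i\,\frac{t^3}{6} pq(p - q) + \theta\,\frac{t^4}{24} \Big| \le \frac{t^2}{8} + \frac{|t|^3}{6 \cdot 9} + \frac{t^4}{24} \le \frac{5}{27} < \frac{1}{2}.$

Applying the estimate

$\log(1 + x) = x + \theta x^2 \quad \forall\, |x| \le \tfrac{1}{2},$

we obtain that for $|t| \le 1$,

$\log(\mathbb{E}e^{it(I - p)}) = -\frac{t^2}{2} pq + i\,\frac{t^3}{6} pq(p - q) + \theta\,\frac{t^4}{24} + \theta\Big( \frac{5 t^2}{27} \Big)^2 = -\frac{t^2}{2} pq + i\,\frac{t^3}{6} pq(p - q) + \theta\,\frac{t^4}{10},$

and now summing,

$\log \phi_n(t) = -\frac{t^2}{2} \sum_{j=1}^{n} p_j q_j + i\,\frac{t^3}{6} \sum_{j=1}^{n} p_j q_j (p_j - q_j) + \theta\,\frac{n t^4}{10}.$

Exponentiating gives (14).

To prove (15), observe that

$|\phi_n(t)| = \prod_{j=1}^{n} |\mathbb{E}e^{it(I_j - p_j)}| = \prod_{j=1}^{n} (p_j^2 + q_j^2 + 2 p_j q_j \cos t)^{1/2} = \Big( \prod_{j=1}^{n} (1 - 2(1 - \cos t) p_j q_j) \Big)^{1/2} \le \exp\Big( -(1 - \cos t) \sum_{j=1}^{n} p_j q_j \Big),$

and then use (13) and that $1 - \cos t \ge t^2/6$ for all $-\pi \le t \le \pi$. □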
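
The two conclusions of Lemma 2.1 can be checked numerically on a grid. The sketch below is illustrative only, with hypothetical function names. It assumes that (14) reads $\phi_n(t) = \exp(-t^2 v_n^2/2 + i t^3 w_n/6 + \theta n t^4/10)$ with $|\theta| \le 1$, so that the logarithm of $\phi_n$ differs from its cubic approximation by at most $n t^4/10$ for $|t| \le 1$, and it checks the bound (15) on all of $[-\pi, \pi]$.

```python
import cmath
import math


def centered_factor(p, t):
    """Characteristic function of a centered Bernoulli(p) indicator at t."""
    q = 1.0 - p
    return p * cmath.exp(1j * t * q) + q * cmath.exp(-1j * t * p)


def check_lemma_2_1(ps, grid=400):
    """Return (ok14, ok15): whether the assumed form of (14) and the bound
    (15) hold at every grid point in [-pi, pi]."""
    n = len(ps)
    v2 = sum(p * (1.0 - p) for p in ps)
    w = sum(p * (1.0 - p) * (2.0 * p - 1.0) for p in ps)  # p - q = 2p - 1
    ok14 = ok15 = True
    for i in range(-grid, grid + 1):
        t = math.pi * i / grid
        factors = [centered_factor(p, t) for p in ps]
        modulus = math.prod(abs(z) for z in factors)
        if modulus > math.exp(-t * t * v2 / 6.0) + 1e-12:
            ok15 = False
        if abs(t) <= 1.0:
            # Summing factor-wise logs avoids branch issues of log(product).
            log_phi = sum(cmath.log(z) for z in factors)
            cubic = complex(-t * t * v2 / 2.0, t ** 3 * w / 6.0)
            if abs(log_phi - cubic) > n * t ** 4 / 10.0 + 1e-12:
                ok14 = False
    return ok14, ok15


probabilities = [0.1 + 0.8 * ((7 * j) % 11) / 10 for j in range(60)]
print(check_lemma_2_1(probabilities))
```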

Definition 2.1. For a possibly empty set of parameters $\mu$, we will write $f_n = O_\mu(g_n)$ if there exist a constant $C_\mu$ and an integer $n_\mu$, both depending only on $\mu$, such that

(16)  $|f_n| \le C_\mu |g_n|$ for all $n \ge n_\mu$;

we write $f_n = o_\mu(g_n)$ if for every $\varepsilon > 0$, there exists $n_{\mu,\varepsilon}$ such that (16) holds with $C_\mu$ replaced by $\varepsilon$. We write $f_n = \Theta_\mu(g_n)$ if $f_n = O_\mu(g_n)$ and $g_n = O_\mu(f_n)$.

In the remainder of this section, recalling $v_n^2 = \sum_{j=1}^{n} p_j q_j$, we assume the following:

Condition 2.1. There exist $\varepsilon > 0$ and $n_0$ such that $v_n^2 \ge \varepsilon n$ for all $n \ge n_0$.

We will again let $\mathcal{N}$ denote a standard normal variable.



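Lemma 2.2. Let $a_n = \sqrt{C \log n / n}$ for $C > 0$. Then under Condition 2.1, for $j \ge 0$,

(17)  $\frac{1}{2\pi} \int_{|t| \le a_n} |t|^j \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, dt = \frac{v_n^{-(j+1)}}{\sqrt{2\pi}} \big( \mathbb{E}|\mathcal{N}|^j + o_{\varepsilon,j,C}(1) \big).$

Proof. By the change of variable $z = v_n t$, the left-hand side of (17) becomes

$\frac{v_n^{-(j+1)}}{2\pi} \int_{|z| \le a_n v_n} |z|^j \exp(-z^2/2)\, dz = \frac{v_n^{-(j+1)}}{\sqrt{2\pi}} \big( \mathbb{E}|\mathcal{N}|^j - 2\,\mathbb{E}\mathcal{N}^j \mathbf{1}(\mathcal{N} > a_n v_n) \big),$

but

$\mathbb{E}\mathcal{N}^j \mathbf{1}(\mathcal{N} > a_n v_n) \le \mathbb{E}\mathcal{N}^j \mathbf{1}(\mathcal{N} > \sqrt{C \varepsilon \log n}) = o_{\varepsilon,j,C}(1),$

as $n \to \infty$, by the dominated convergence theorem. □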

For a bounded function $f$ on $[-\pi, \pi]$, define

$\|f\|_\infty = \sup_{|t| \le \pi} |f(t)|.$

Lemma 2.3. Under Condition 2.1, for any $K > 0$ and $f(t)$ a bounded measurable function on $[-\pi, \pi]$, setting

$a_n = \sqrt{C \log n / n}$ with $C \ge 6 \varepsilon^{-1} K,$

we have

$\int_{a_n < |t| \le \pi} f(t) \phi_n(t)\, dt = \|f\|_\infty\, O(n^{-K}).$

Proof. Using Lemma 2.1,

$\Big| \int_{a_n < |t| \le \pi} f(t) \phi_n(t)\, dt \Big| \le \|f\|_\infty \int_{a_n < |t| \le \pi} |\phi_n(t)|\, dt \le \|f\|_\infty \int_{a_n < |t| \le \pi} e^{-t^2 v_n^2 / 6}\, dt \le 2\pi \|f\|_\infty e^{-a_n^2 v_n^2 / 6} \le 2\pi \|f\|_\infty n^{-K}.$ □



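Lemma 2.4. Let $\phi_n(t)$ be the characteristic function of the sum of $n$ independent centered Bernoulli variables, and suppose that Condition 2.1 holds. Define for $j \ge 0$,

(18)  $I_{n,j} = \frac{1}{2\pi} \int_{|t| \le \pi} t^j \phi_n(t)\, dt$ and $I^0_{n,j} = \frac{I_{n,j}}{I_{n,0}}.$

Then for $j$ even,

(19)  $I_{n,j} = \Theta_{\varepsilon,j}(n^{-(j+1)/2})$ and $I^0_{n,j} = v_n^{-j}\, \mathbb{E}\mathcal{N}^j + o_{\varepsilon,j}(n^{-j/2}),$

and for $j$ odd,

$I_{n,j} = O_{\varepsilon,j}(n^{-(j+2)/2});$

in particular,

$I^0_{n,j} = O_{\varepsilon,j}(n^{-(j + j \bmod 2)/2})$ for $j \ge 0.$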

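Proof. Let $a_n = \sqrt{C \log n / n}$ with $C = 6 \varepsilon^{-1} ((j + 3)/2)$. Lemma 2.3 yields

$\int_{a_n < |t| \le \pi} t^j \phi_n(t)\, dt = \pi^j\, O(n^{-(j+3)/2}),$

so it suffices to consider the region $|t| \le a_n$. Take $n_{\varepsilon,j}$ so that for $n \ge n_{\varepsilon,j}$, $a_n \le 1$ and $n a_n^3 \le 3$. Since $|t| \le a_n \le 1$, (14) of Lemma 2.1 gives

$\phi_n(t) = \exp\Big( -\frac{t^2 v_n^2}{2} + i\,\frac{t^3 w_n}{6} + \theta\,\frac{n t^4}{10} \Big) = \exp\Big( -\frac{t^2 v_n^2}{2} \Big) \exp\Big( i\,\frac{t^3 w_n}{6} + \theta\,\frac{n t^4}{10} \Big).$

For $n \ge n_{\varepsilon,j}$, we see that $n a_n^4 / 10 \le n a_n^3 / 6 \le 1/2$. Therefore, for $|t| \le a_n$, $|x| \le 1$, where $x = i t^3 w_n / 6 + \theta n t^4 / 10$. Now using the fact that for $|x| \le 1$,

$e^x = 1 + x + O(x^2),$

we have

$\phi_n(t) = \exp\Big( -\frac{t^2 v_n^2}{2} \Big) \Big( 1 + i\,\frac{t^3 w_n}{6} \Big) + \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, O(n t^4 + t^6 w_n^2).$

Lemma 2.2 shows that the second term contributes $O_{\varepsilon,j}(n^{-(j+3)/2})$ to $I_{n,j}$, since

$n \int_{|t| \le a_n} |t|^{j+4} \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, dt = n\, \Theta_{\varepsilon,j}(n^{-(j+5)/2}) = \Theta_{\varepsilon,j}(n^{-(j+3)/2})$

and

$w_n^2 \int_{|t| \le a_n} |t|^{j+6} \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, dt = O_{\varepsilon,j}(n^{-(j+3)/2}).$

Now focusing on the contribution from the first term, using $v_n^2 = \Theta_\varepsilon(n)$ by Condition 2.1, symmetry and Lemma 2.2 for $j$ even, we have

$I_{n,j} = \frac{1}{2\pi} \int_{|t| \le a_n} t^j \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, dt + O_{\varepsilon,j}(n^{-(j+3)/2}) = \frac{v_n^{-(j+1)}}{\sqrt{2\pi}} \big( \mathbb{E}\mathcal{N}^j + o_{\varepsilon,j}(1) \big),$

yielding (19).

For $j$ odd, again using symmetry,

$n^{(j+2)/2} I_{n,j} = n^{(j+2)/2}\, \frac{i w_n}{12\pi} \int_{|t| \le a_n} t^{j+3} \exp\Big( -\frac{t^2 v_n^2}{2} \Big)\, dt + O_{\varepsilon,j}(n^{-1/2}) = n^{(j+2)/2}\, \frac{i w_n\, v_n^{-(j+4)}}{6\sqrt{2\pi}} \big( \mathbb{E}\mathcal{N}^{j+3} + o_{\varepsilon,j}(1) \big) + O_{\varepsilon,j}(n^{-1/2});$

the right-hand side is now seen to be $O_{\varepsilon,j}(1)$. □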

For $\mathbb{E}X_n + \nu$ an integer, define

(20)  $f_{n,\nu} = P(X_n = \mathbb{E}X_n + \nu).$

The following theorem gives a high-order local central limit expansion for the probabilities of such deviations from the mean $\mathbb{E}X_n$.

Theorem 2.1. Let $I_1, I_2, \ldots$ be independent Bernoulli variables with $p_j = \mathbb{E}I_j$ satisfying Condition 2.1. For any nonnegative integer $s$, define

(21)  $m_{n,\nu}(s) = \sum_{j=0}^{s} \frac{(-i\nu)^j}{j!}\, I_{n,j}.$

Then for given $\kappa$ and even $s$,

$f_{n,\nu} = m_{n,\nu}(s) + O_{\varepsilon,\kappa,s}(n^{-(s+3)/2})$ for all $|\nu| \le \kappa$ with $\mathbb{E}X_n + \nu$ an integer.

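Proof. Let

$R_{n,\nu} = f_{n,\nu} - \sum_{j=0}^{s+2} \frac{(-i\nu)^j}{j!}\, I_{n,j}, \qquad g(x) = e^x - \sum_{j=0}^{s+2} \frac{x^j}{j!}$

and $a_n = \sqrt{C \log n / n}$ with $C = 6 \varepsilon^{-1} (s + 4)/2$. By the inversion formula,

$f_{n,\nu} = \frac{1}{2\pi} \int_{|t| \le \pi} e^{-i\nu t} \phi_n(t)\, dt,$

so

$2\pi R_{n,\nu} = \int_{|t| \le \pi} g(-it\nu) \phi_n(t)\, dt = \int_{|t| \le a_n} g(-it\nu) \phi_n(t)\, dt + \int_{a_n < |t| \le \pi} g(-it\nu) \phi_n(t)\, dt.$

Since $|\phi_n(t)| \le 1$, $|\nu| \le \kappa$ and $|g(x)| \le C_s |x|^{s+3}$ for $|x| \le \pi\kappa$, the first integral is bounded by

$\int_{|t| \le a_n} |g(-it\nu)|\, dt \le 2 a_n \sup_{|t| \le a_n} |g(-it\nu)| \le 2 C_s \kappa^{s+3} a_n^{s+4} = O_{\varepsilon,\kappa,s}\Big( \Big( \frac{\log n}{n} \Big)^{(s+4)/2} \Big).$

Since $\sup_{|t| \le \pi} |g(-it\nu)| \le C_s (\pi\kappa)^{s+3}$, Lemma 2.3 shows that the second integral is $O_{\kappa,s}(n^{-(s+4)/2}) \subset O_{\varepsilon,\kappa,s}((\log n / n)^{(s+4)/2})$. Consequently, for all $s$, we obtain

$f_{n,\nu} = \sum_{j=0}^{s+2} \frac{(-i\nu)^j}{j!}\, I_{n,j} + O_{\varepsilon,\kappa,s}\Big( \Big( \frac{\log n}{n} \Big)^{(s+4)/2} \Big).$

When $s$ is even, we have by Lemma 2.4,

$I_{n,s+1} = O_{\varepsilon,s}(n^{-(s+3)/2}), \qquad I_{n,s+2} = \Theta_{\varepsilon,s}(n^{-(s+3)/2});$

we now obtain the result by observing

$\Big( \frac{\log n}{n} \Big)^{(s+4)/2} = o(n^{-(s+3)/2}).$

□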
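
To leading order the local expansion says that the probability of a bounded deviation from the mean is close to $1/(v_n \sqrt{2\pi})$, and that probabilities at nearby bounded deviations differ from one another only at a smaller order. The sketch below is an illustration with hypothetical names, not part of the original development: it computes the exact Poisson-Binomial probabilities by dynamic programming for an arbitrarily chosen set of success probabilities with integer mean and compares them with that leading term.

```python
import math


def poisson_binomial_pmf(ps):
    """pmf[k] = P(sum of independent Bernoulli(p_j) equals k)."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1.0 - p)      # indicator equals 0
            new[k + 1] += mass * p          # indicator equals 1
        pmf = new
    return pmf


def local_clt_leading_term(ps):
    """Leading-order approximation 1 / (v_n * sqrt(2 * pi))."""
    v2 = sum(p * (1.0 - p) for p in ps)
    return 1.0 / math.sqrt(2.0 * math.pi * v2)


ps = [0.2] * 100 + [0.3] * 100 + [0.5] * 200   # illustrative; mean is 150
pmf = poisson_binomial_pmf(ps)
mean = round(sum(ps))
leading = local_clt_leading_term(ps)
for nu in (-2, -1, 0, 1, 2):
    print(nu, pmf[mean + nu], leading)
```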

3. Finite population sampling and inclusion probabilities. To extend the results of Section 2, let there be given for all $A \in \mathbb{N} = \{0, 1, \ldots\}$ a "weight" $x_A \ge 0$, and for $\lambda > 0$, let $T_\lambda$ be the measure under which $I_A$ for $A \in \mathbb{N}$ are independent indicator variables with success probability

(22)  $p_{A,\lambda} = \frac{\lambda x_A}{1 + \lambda x_A}.$

The case considered in Section 2 corresponds to $\lambda = 1$ and $x_j = p_j / q_j$.

We will assume that the $x_A$ weights are "asymptotically stable" in the following sense.

Condition 3.1. For all $\delta \in (0, 1)$, there exist $\varepsilon \in (0, 1)$ and $n \ge 1$ such that for any finite $E \subset \mathbb{N}$ with $|E| \ge n$,

(23)  $\frac{1}{|E|} \sum_{A \in E} \mathbf{1}(x_A \in [\varepsilon, \varepsilon^{-1}]) \ge 1 - \delta.$

Now let $p_{A,\lambda} + q_{A,\lambda} = 1$,

(24)  $X_E = \sum_{A \in E} I_A, \qquad v_{E,\lambda}^2 = \sum_{A \in E} p_{A,\lambda}\, q_{A,\lambda},$

and with $T_\lambda(X_E)$ denoting the expectation of $X_E$ with respect to $T_\lambda$ and $T_\lambda(X_E) + \nu$ an integer, set

(25)  $f_{E,\lambda,\nu} = T_\lambda(X_E = T_\lambda(X_E) + \nu).$

In this section we will provide a local central limit theorem expansion for the probabilities in (25) which holds uniformly for $\lambda$ in an interval bounded away from zero and infinity. Conditioning $T_\lambda$ to have exactly $\eta$ successes over $E$ yields $P_{E,\eta}$ (Lemma 3.5), and by selecting the $\lambda$ which yields $T_\lambda(X_E) = \eta$, we obtain a high-order expansion for the probability that $A$ is included in a sample with distribution $P_{E,\eta}$.

With $f_{E,\nu}$ a real valued function defined on finite subsets $E \subset \mathbb{N}$ and $\nu \in \mathbb{R}$, for a possibly empty collection of parameters $\mu$, we say

$f_{E,\nu} = O_\mu(g_E)$

if there exist $C_\mu$ and $n_\mu$ such that

$|f_{E,\nu}| \le C_\mu |g_E|$ for all $|E| \ge n_\mu.$

We say $f_{E,\nu} = \Theta_\mu(g_E)$ when $f_{E,\nu} = O_\mu(g_E)$ and $g_E = O_\mu(f_{E,\nu})$. Note that if $F$ and $G$ are any fixed finite subsets of $\mathbb{N}$, then $f_{E,\nu} = O_\mu(|E|^{-\alpha})$ if and only if $f_{E,\nu} = O_\mu(|(E \setminus F) \cup G|^{-\alpha})$.

To see that Condition 3.1 implies Condition 2.1 in Section 2, uniformly for $\lambda$ in an interval bounded away from zero and infinity, we have:
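
Lemma 3.1. Let Condition 3.1 hold and $\gamma \in (0, 1]$. Then there exist $\varepsilon_\gamma > 0$ and $n_\gamma$ such that

$v_{E,\lambda}^2 \ge \varepsilon_\gamma |E|$ for all $\lambda \in [\gamma, 1/\gamma]$ and $|E| \ge n_\gamma.$

Proof. Let $\delta \in (0, 1)$, $\varepsilon \in (0, 1]$ and $n$ be any values satisfying (23), and set

$\varepsilon_\gamma = \frac{(1 - \delta)\gamma\varepsilon}{(1 + \gamma\varepsilon)(1 + \gamma^{-1}\varepsilon^{-1})}$ and $n_\gamma = n.$

Then for any $|E| \ge n_\gamma$ and $\lambda \in [\gamma, 1/\gamma]$,

$v_{E,\lambda}^2 = \sum_{A \in E} \frac{\lambda x_A}{(1 + \lambda x_A)(1 + \lambda x_A)} \ge \sum_{A \in E} \frac{\gamma x_A}{(1 + \gamma x_A)(1 + \gamma^{-1} x_A)} \ge \varepsilon_\gamma |E|.$ □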

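Now let $\phi_{E,\lambda}$ be the characteristic function of $X_E - T_\lambda(X_E)$ under the measure $T_\lambda$, and in parallel to (18) and (21), write

$I_{E,\lambda,j} = \frac{1}{2\pi} \int_{|t| \le \pi} t^j \phi_{E,\lambda}(t)\, dt$ and $m_{E,\lambda,\nu}(s) = \sum_{j=0}^{s} \frac{(-i\nu)^j}{j!}\, I_{E,\lambda,j},$

with $I^0_{E,\lambda,j} = I_{E,\lambda,j} / I_{E,\lambda,0}$.

Lemma 3.2. Let Condition 3.1 be satisfied and $\gamma \in (0, 1]$. Then for all $\lambda \in [\gamma, 1/\gamma]$, for $j$ even,

(26)  $I_{E,\lambda,j} = \Theta_{\gamma,j}(|E|^{-(j+1)/2})$ and $I^0_{E,\lambda,j} = v_{E,\lambda}^{-j}\, \mathbb{E}\mathcal{N}^j + o_{\gamma,j}(|E|^{-j/2}),$

and for $j$ odd,

(27)  $I_{E,\lambda,j} = O_{\gamma,j}(|E|^{-(j+2)/2});$

in particular, for all $j$,

(28)  $I^0_{E,\lambda,j} = O_{\gamma,j}(|E|^{-(j + j \bmod 2)/2})$ for $j \ge 0.$

Further, for given $\kappa$ and even $s$, for $T_\lambda(X_E) + \nu \in \mathbb{N}$,

(29)  $f_{E,\lambda,\nu} = m_{E,\lambda,\nu}(s) + O_{\gamma,\kappa,s}(|E|^{-(s+3)/2}), \quad |\nu| \le \kappa.$

Proof. Lemma 3.1 in conjunction with Lemma 2.4 gives (26)-(28), and in conjunction with Theorem 2.1 gives (29). □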



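Now for $t \in \{0, 1, \ldots\}$ let

$S^t f_{E,\nu} = f_{E,\nu - t},$

(30)  $\Delta^0 f_{E,\nu} = f_{E,\nu}, \quad \Delta f_{E,\nu} = f_{E,\nu} - f_{E,\nu-1}$ and $\Delta^{t+1} = \Delta\Delta^t.$

For $q$ a nonnegative integer, the following classes of functions $f_{E,\nu}$ play a crucial role:

(31)  $\mathcal{G}^q_\mu = \big\{ f_{E,\nu} : \forall t \ge 0,\ \Delta^t f_{E,\nu} = O_{\mu,t,\nu}\big( |E|^{-(t + q + (t + q) \bmod 2)/2} \big) \big\}.$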
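
Lemma 3.3. Let $p \le q$ be nonnegative integers, and suppose that $f_{E,\nu} \in \mathcal{G}^p_\mu$ and $g_{E,\nu} \in \mathcal{G}^q_\mu$. Then

(32)  $\mathcal{G}^q_\mu \subset \mathcal{G}^p_\mu,$

(33)  $a f_{E,\nu},\ f_{E,\nu} + g_{E,\nu} \in \mathcal{G}^p_\mu,$

(34)  $\forall t \ge 0 \quad \Delta^t f_{E,\nu} \in \mathcal{G}^{p+t}_\mu,$

(35)  $\forall t \ge 0 \quad f_{E,\nu} - f_{E,\nu-t} \in \mathcal{G}^{p+1}_\mu,$

(36)  $\forall t \ge 0 \quad S^t f_{E,\nu} \in \mathcal{G}^p_\mu,$

(37)  $f_{E,\nu}\, g_{E,\nu} \in \mathcal{G}^{p+q}_\mu.$

Proof. Without loss of generality take $\mu = \emptyset$. Equation (32) follows since $p + j + (p + j) \bmod 2$ is increasing in $p$. Equation (33) follows by (32) and the linearity of $\Delta$. Equation (34) follows from the definition of $\mathcal{G}^p$.

For (35), write

$f_{E,\nu} - f_{E,\nu-t} = \sum_{j=0}^{t-1} (f_{E,\nu-j} - f_{E,\nu-j-1}) = \sum_{j=0}^{t-1} \Delta f_{E,\nu-j};$

by (34), the summands are in $\mathcal{G}^{p+1}$, and hence by (33), so is the sum itself, proving (35). Now $S f_{E,\nu} = f_{E,\nu-1} = f_{E,\nu} - \Delta f_{E,\nu} \in \mathcal{G}^p$ by (34) and (33); the case for general $t$ in (36) follows by induction.

The verification of equation (37) can be accomplished using the fact that $\Delta^t S^j = S^j \Delta^t$ for all nonnegative $j, t$ and the following product rule which can be easily proved by induction:

$\Delta^t (f_{E,\nu}\, g_{E,\nu}) = \sum_{0 \le j \le t} \binom{t}{j} (S^{t-j} \Delta^j f_{E,\nu})(\Delta^{t-j} g_{E,\nu}).$
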



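For notational ease, we suppress the variable $s$ in the quantity $m^0_{E,\lambda,\nu}$ defined below. Lemmas 3.3 and 3.2 have the following consequence.

Lemma 3.4. Let Condition 3.1 hold and $\gamma \in (0, 1]$. Then for all $\lambda \in [\gamma, 1/\gamma]$,

(38)  $m^0_{E,\lambda,\nu} = \sum_{j=0}^{s} \frac{(-i\nu)^j}{j!}\, I^0_{E,\lambda,j} \in \mathcal{G}^0_{\gamma,s}.$

Further, defining

(39)  $n_{E,\lambda,\nu} = m^0_{E,\lambda,\nu} - p_{A,\lambda}\, \Delta m^0_{E,\lambda,\nu},$

we have

(40)  $n_{E,\lambda,\nu} - 1 = O_{\gamma,s,\nu}(|E|^{-1})$

and

(41)  $n_{E,\lambda,\nu} - 1 \in \mathcal{G}^0_{\gamma,s}.$

Proof. Note that $\Delta^t \nu^j = 0$ for $j < t$, and hence for $0 \le t \le s$,

$\Delta^t m^0_{E,\lambda,\nu} = \sum_{j=t}^{s} \frac{(-i)^j}{j!} (\Delta^t \nu^j)\, I^0_{E,\lambda,j} = O_{\gamma,s,t,\nu}(|E|^{-(t + t \bmod 2)/2}).$

For $t > s$, $\Delta^t m^0_{E,\lambda,\nu} = 0$. This proves (38).

Now (40) follows from $I^0_{E,\lambda,0} = 1$,

$m^0_{E,\lambda,\nu} = 1 + O_{\gamma,s,\nu}(|E|^{-1})$ and $\Delta m^0_{E,\lambda,\nu} = O_{\gamma,s,\nu}(|E|^{-1}).$

By (38), we have $m^0_{E,\lambda,\nu} \in \mathcal{G}^0_{\gamma,s}$ and $\Delta m^0_{E,\lambda,\nu} \in \mathcal{G}^1_{\gamma,s}$, and (32) and (33) of Lemma 3.3 give $n_{E,\lambda,\nu} \in \mathcal{G}^0_{\gamma,s}$ upon noting that $p_{A,\lambda}$ is constant in $\nu$. Since $1 \in \mathcal{G}^0_{\gamma,s}$, applying (33) again gives (41). □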

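Let $E$ be a finite subset of $\mathbb{N}$, and recall $x_d = \prod_{A \in d} x_A$ and the probability distribution $P_{E,\eta}$ given in (4). For convenience, we will write, for instance, $P_{E,\eta}(A)$ in place of $P_{E,\eta}(A \in D)$, or $P_{E,\eta}(s)$ for $P_{E,\eta}(s \subset D)$. Also recall the product measure $T_\lambda$ with marginals given by (22), such that for all $d \subset E$,

$T_\lambda(\{A \in E : I_A = 1\} = d) = \prod_{A \in d} p_{A,\lambda} \prod_{A \in E \setminus d} q_{A,\lambda}.$

The following lemma provides a key relation between $T_\lambda$ and $P_{E,\eta}$; the quantity $X_E$ is as in (24).

Lemma 3.5. For any $(E, \eta)$ with $0 \le \eta \le |E|$, $d \subset E$ with $|d| = \eta$ and $\lambda > 0$,

(42)  $T_\lambda(\{A \in E : I_A = 1\} = d) = \lambda^{|d|} \Big( \prod_{A \in E} \frac{1}{1 + \lambda x_A} \Big) x_d,$

(43)  $P_{E,\eta}(d) = T_\lambda(\{A \in E : I_A = 1\} = d \mid X_E = \eta)$

and for $A \in E$ and $F = E \setminus A$,

(44)  $P_{E,\eta}(A) = \frac{p_{A,\lambda}\, T_\lambda(X_F = \eta - 1)}{p_{A,\lambda}\, T_\lambda(X_F = \eta - 1) + q_{A,\lambda}\, T_\lambda(X_F = \eta)}.$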

Proof. Summing (42) over subsets of $E$ of size $\eta$ gives

(45)  $T_\lambda(X_E = \eta) = \lambda^\eta \Big( \prod_{A \in E} \frac{1}{1 + \lambda x_A} \Big) \sum_{u \subset E,\, |u| = \eta} x_u,$

and now, since $|d| = \eta$, division of (42) by (45) yields (43). Next,

$P_{E,\eta}(A) = \frac{T_\lambda(I_A = 1, X_E = \eta)}{T_\lambda(X_E = \eta)} = \frac{T_\lambda(I_A = 1, X_F = \eta - 1)}{T_\lambda(I_A = 1, X_F = \eta - 1) + T_\lambda(I_A = 0, X_F = \eta)},$

and (44) now follows using the independence of the variables $I_A$ and $X_F$ under $T_\lambda$.

□
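
Relation (44) gives a practical way to compute rejective inclusion probabilities without enumerating subsets, because the law of $X_F$ under the product measure is a Poisson-Binomial distribution. The sketch below is a minimal illustration with hypothetical function names; it evaluates (44) for an arbitrary $\lambda > 0$ (assuming $1 \le \eta \le |E|$) and compares the result with brute-force enumeration of (4) on a small population.

```python
import math
from itertools import combinations


def poisson_binomial_pmf(ps):
    """pmf[k] = P(sum of independent Bernoulli(p_j) equals k)."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        pmf = new
    return pmf


def inclusion_probability(weights, eta, a, lam=1.0):
    """Inclusion probability of unit a from (44); any lam > 0 may be used."""
    p = [lam * x / (1.0 + lam * x) for x in weights]
    rest = poisson_binomial_pmf(p[:a] + p[a + 1:])      # law of X_F, F = E \ A
    num = p[a] * rest[eta - 1]
    tail = rest[eta] if eta < len(rest) else 0.0        # X_F cannot exceed |F|
    return num / (num + (1.0 - p[a]) * tail)


def inclusion_probability_brute(weights, eta, a):
    """Same probability by summing weight products over all size-eta subsets."""
    total = hit = 0.0
    for d in combinations(range(len(weights)), eta):
        w = math.prod(weights[i] for i in d)
        total += w
        if a in d:
            hit += w
    return hit / total


x = [0.5, 1.0, 2.0, 1.5, 0.8, 3.0]   # illustrative weights
for lam in (0.3, 1.0, 4.0):
    print(lam, [round(inclusion_probability(x, 3, a, lam), 6) for a in range(len(x))])
```

The printed rows agree for every value of $\lambda$, reflecting that the conditional law in (43) does not depend on $\lambda$; the freedom to choose $\lambda$ is what the subsequent expansion exploits.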

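In the following, for $\tau \in (0, 1/2]$, let

$\mathcal{E}_\tau = \{(E, \eta) : \tau \le \eta/|E| \le 1 - \tau\}.$

Lemma 3.6. Suppose that Condition 3.1 is satisfied. Then for all $\tau \in (0, 1/2]$, there exist $\gamma_\tau \in (0, 1]$ and $n_\tau$ depending only on $\tau$ such that for all $(E, \eta) \in \mathcal{E}_\tau$ with $|E| \ge n_\tau$, there exists a unique solution $\lambda = \lambda(E, \eta)$ to the equation

(46)  $h_E(\lambda) = \frac{\eta}{|E|}$, where $h_E(\lambda) = \frac{1}{|E|} \sum_{A \in E} p_{A,\lambda},$

and

(47)  $\lambda(E, \eta) \in [\gamma_\tau, 1/\gamma_\tau].$

Proof. Let $\delta = (1/2) \min\{\tau, 1 - 2\tau\}$ and take $\varepsilon$ and $n_\tau = n$ satisfying (23) for this $\delta$. Then for all $|E| \ge n_\tau$ and $\lambda > 0$,

(48)  $(1 - \delta)\, \frac{\lambda\varepsilon}{1 + \lambda\varepsilon} \le h_E(\lambda) \le \delta + (1 - \delta)\, \frac{\lambda\varepsilon^{-1}}{1 + \lambda\varepsilon^{-1}}.$

Hence, $h_E(\lambda)$, continuous and strictly increasing on $[0, \infty)$ as a function of $\lambda$, satisfies

$\lim_{\lambda \to 0} h_E(\lambda) \le \delta$ and $\lim_{\lambda \to \infty} h_E(\lambda) \ge 1 - \delta.$

Since $\delta < \tau \le \eta/|E| \le 1 - \tau < 1 - \delta$, there exists a unique value $\lambda(E, \eta)$ in $(0, \infty)$ for which $h_E(\lambda)$ takes on the value $\eta/|E|$.

Since $\eta/|E| \in [\tau, 1 - \tau]$ and $\lambda(E, \eta)$ solves (46), by (48)

$\tau \le \delta + (1 - \delta)\, \frac{\lambda(E, \eta)\varepsilon^{-1}}{1 + \lambda(E, \eta)\varepsilon^{-1}}$ and $(1 - \delta)\, \frac{\lambda(E, \eta)\varepsilon}{1 + \lambda(E, \eta)\varepsilon} \le 1 - \tau,$

yielding, respectively,

$\lambda(E, \eta) \ge \frac{\varepsilon(\tau - \delta)}{1 - \tau}$ and $\lambda(E, \eta) \le \frac{1 - \tau}{\varepsilon(\tau - \delta)}.$

Verifying that $0 < (\tau - \delta)/(1 - \tau) < 1$ completes the proof of claim (47). □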

Theorem 3.1. Suppose that Condition 3.1 is satisfied and let he given 
T £ (0, 1/2] , K G N and s even. Then there exists such that for all {E, rj) £ 
£r with \E\ > Ur, A = X{E,r]) exists, and for all \k\ < k, 

s/2 
1=0 

for all A £ E, where F = E \ A, v = k + Pa,\, o^^^ "^f a v ^'^'^ ^'fxu ^'"^ 
defined in (38) and (39). In particular, for all |z^| < k, 

(50) ^E,r,+k{A)=pA,x + Or,^i\E\-') and 

AFE,^+k{A)=pA,xqA,xlF,x,2 + OrA\E\~^)- 

Proof. By Lemma 3.6, the solutions A = X{E, rj) exist for all (-E, rf) G 
with \E\ >nj- and lie in an interval [7,-, l/7r] for some 7,- G (0, 1] depending 
only on r. 

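Hence, first applying Lemma 3.5,

$$P_{E,\eta+k}(A) = \frac{p_{A,\lambda} P_\lambda(X_F = \eta+k-1)}{p_{A,\lambda} P_\lambda(X_F = \eta+k-1) + q_{A,\lambda} P_\lambda(X_F = \eta+k)}$$

for all $\lambda > 0$, and in particular upon setting $\lambda = \lambda(E,\eta)$.
Since $\eta + k = E_\lambda(X_E) + k = E_\lambda(X_F) + p_{A,\lambda} + k = E_\lambda(X_F) + \nu$, this probability equals

$$\frac{p_{A,\lambda} f_{F,\lambda,\nu-1}}{p_{A,\lambda} f_{F,\lambda,\nu-1} + q_{A,\lambda} f_{F,\lambda,\nu}}.$$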

Letting Q{s/2) = Qr.K,s{\E\~^^'^) for short and applying Lemma 3.2, this 



PA,xinF,x,u~i + Q((g + 3)/2) 

PA,xmF,x,u-i + qA,xmF,x,u + 0((s + 3)/2) 
PA,A"iRA,.-l + + 2)/2)



PA,xm%.^^^^^ + (?A,A"^F,A,. + + 2)/2) 

[since Ti.,A,o = e.(|^|"i/2)] 



, + e((s + 2)/2) 



„0 
F,Xm 



+e((. + 2)/2). 



1 + «,A,.-1) 

Equation (40) of Lemma 3.4 gives n^xu ~ ^~ Ot,i/,s{\E\~^), hence a Taylor 
expansion in x of the quotient 1/(1 + x) to order s/2 yields an error term 
of order in%x,u - 1)'/^+^ = Or,i.,s(|-E|-('^+2)/2), and therefore (49). 

Using s = 2 in (49) and collecting terms of order Or,u{\E\~'^), we obtain 

rE,^+k{A) =PA,X{^ + qA,x{^F,X,l + ('^ - V2)X^,A,2) + Or,.(|i?r')), 

proving (50). □ 
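The leading term in (50) can be compared with the exact inclusion probability on small examples. The sketch below assumes the identity used at the start of the proof, namely that $P_{E,\eta}(A)$ equals $p_{A,\lambda}P_\lambda(X_F=\eta-1)$ divided by $p_{A,\lambda}P_\lambda(X_F=\eta-1) + q_{A,\lambda}P_\lambda(X_F=\eta)$ with $F = E \setminus A$ and any $\lambda > 0$, together with $p_{A,\lambda} = \lambda x_A/(1+\lambda x_A)$. The function names and the dynamic-programming evaluation of the Poisson binomial probabilities are illustrative choices, not the paper's.

```python
def poisson_binomial_pmf(probs):
    """pmf[k] = P(sum of independent Bernoulli(probs[i]) = k), built one variable at a time."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf


def inclusion_probability(x, a, eta, lam=1.0):
    """Exact probability that index a is in a rejective sample of size eta >= 1 from weights x."""
    p = [lam * xi / (1.0 + lam * xi) for xi in x]
    pmf = poisson_binomial_pmf(p[:a] + p[a + 1:])    # law of X_F with F = E \ {a}
    num = p[a] * pmf[eta - 1]
    tail = pmf[eta] if eta < len(pmf) else 0.0       # X_F cannot exceed |E| - 1
    return num / (num + (1.0 - p[a]) * tail)
```

Because the conditional law does not depend on $\lambda$, any positive `lam` returns the same value up to rounding; choosing `lam` as the solution of (46) is what makes $p_{A,\lambda}$ the natural first-order approximation.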

Under the hypotheses of Theorem 3.1 we have the following: 

Corollary 3.1. 
(51) ^E,v+k{A) G 

Proof. For $t = 0$, $\Delta^0 P_{E,\eta+k}(A) = P_{E,\eta+k}(A) = O_{\tau,k}(1)$. Given arbitrary $t \ge 1$, take

$s = t + t \bmod 2 - 2$ and $\kappa = t$.

Since rrPpxy G Qr and (41) of Lemma 3.4 gives n%-xu ^ ^tj repeated 
application of Lemma 3.3 shows 

s/2 

PA,xmF,X,uY.('^F,X,,. - 1)' G 
1=1 

For the error term, by the choice s + 2 = t + t mod 2, 

A*0,,t(|^|-(^+2)/2) ^ ^iQ^^^(|^|-(t+tmod2)/2) ^ q^^^ ( |^| -(i+i mod 2)/2 ^ ^ 

Therefore,

$$\Delta^t P_{E,\eta+k}(A) = O_{\tau,t,k}(|E|^{-(t + t \bmod 2)/2})$$
for all $t \ge 0$, and (51) follows. □
4. High-order weighted sampling correlations. For $E \subset \mathbb{N}$ and sets $(u \cup v) \subset E$ with $u \cap v = \emptyset$ and $\eta \ge |u|$, we define the (conditional) measure $P^{u,v}_{E,\eta}$ supported on sets $d \subset E$ of size $\eta$ with $d \supset u$ and $d \cap v = \emptyset$ by

$$P^{u,v}_{E,\eta}(D = d) = P_{E,\eta}(D = d \mid D \supset u, D \cap v = \emptyset);$$

that is, $P^{u,v}_{E,\eta}$ is the measure $P_{E,\eta}$ conditioned to contain every element of $u$ but none of the elements of $v$. The measures considered in the previous sections were the unconditioned special case

$$P_{E,\eta} = P^{\emptyset,\emptyset}_{E,\eta};$$

the $\emptyset,\emptyset$ superscript may be omitted. We define (commutative) differences $\Delta^B$ on the measure $P^{u,v}_{E,\eta}$ for $B \in E \setminus (u \cup v)$ by

$$\Delta^B P^{u,v}_{E,\eta} = P^{u \cup B, v}_{E,\eta} - P^{u, v \cup B}_{E,\eta}.$$

For $k \in \mathbb{N}$ the operators $\Delta^k$ will continue to be used in accordance with (30).

The following lemma gives some key properties of the conditional mea- 
sure P^-'^, including a useful relation to its unconditional version P_B,r?- 

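Lemma 4.1. Let $u, v$ be disjoint subsets of $E$. For $H \subset E \setminus (u \cup v)$,

(52) $\Delta^H P^{u,v}_{E,\eta} = \sum_{\alpha \cup \beta = H,\ \alpha \cap \beta = \emptyset} (-1)^{|\beta|} P^{u \cup \alpha, v \cup \beta}_{E,\eta}.$

For $d \subset E$ such that $d \supset u$ and $d \cap v = \emptyset$,

(53) $P^{u,v}_{E,\eta}(d) = P_{E \setminus (u \cup v),\, \eta - |u|}(d \setminus u),$
and for $A \notin (u \cup v)$ and $H \subset E \setminus (u \cup v \cup A)$,

(54) $\Delta^H P^{u,v}_{E,\eta}(A) = (-1)^{|H|} \Delta^{|H|} P^{u,v}_{E \setminus H,\, \eta}(A).$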

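Proof. Relation (52) can be shown by induction. By definition (4) and that of conditional probability, using $u \subset d \subset E \setminus v$, we have

$$P^{u,v}_{E,\eta}(d) = \frac{x_d}{\sum_{w :\, u \subset w \subset E,\ w \cap v = \emptyset,\ |w| = \eta} x_w} = \frac{x_{d \setminus u}}{\sum_{w :\, u \subset w \subset E,\ w \cap v = \emptyset,\ |w| = \eta} x_{w \setminus u}},$$

since both $d$ and $w$ contain $u$ and the factor $x_u$, which appears in both $x_d = x_{d \setminus u} x_u$ and $x_w = x_{w \setminus u} x_u$, can be cancelled. Furthermore, because $u, v$ are disjoint, $w \cap v = \emptyset$ if and only if $(w \setminus u) \cap v = \emptyset$, and when $u \subset w$ and $|w| = \eta$, then $|w \setminus u| = \eta - |u|$. Hence,

$$\{w \setminus u : u \subset w \subset E,\ w \cap v = \emptyset,\ |w| = \eta\}$$

$$= \{w \setminus u : w \setminus u \subset E \setminus u,\ (w \setminus u) \cap v = \emptyset,\ |w \setminus u| = \eta - |u|\}$$

$$= \{w : w \subset E \setminus u,\ w \cap v = \emptyset,\ |w| = \eta - |u|\}$$

$$= \{w : w \subset E \setminus (u \cup v),\ |w| = \eta - |u|\}$$

and $P^{u,v}_{E,\eta}(d)$ equals

$$\frac{x_{d \setminus u}}{\sum_{w :\, w \subset E \setminus (u \cup v),\ |w| = \eta - |u|} x_w} = P_{E \setminus (u \cup v),\, \eta - |u|}(d \setminus u).$$

This proves (53).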

It suffices to prove (54) for (u, v) = (0, 0). First, note for aD P, 

d3A,aCdC-E,dn/3=0 

= E ^E\iaUl3),ri-\a\{fi) 

d9A,aCdC-E,dn/3=0 

= E ^E\iaUl3},r)-\a\{fi) 

d3A,dc£\(aU/3) 

hence, since H, 

A^FeAA)= E i-lf^P^^A) [by (52)] 

aUI3=H,ani3=0 

(-l)I^IPM("U/3),,-|a|(^) [by (55)] 

aUl3=H,ani3=0 
\H\ 

= E E (-l)l^l-^P£;\^,,-,(^) 

j=OaUf3=H,anl3=0,\a\=j 



j=0 

j=0 

{-l)\^\A\^\FE\j,^^iA). □ 
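Relation (53) of Lemma 4.1 lends itself to a brute-force check on a small population. The sketch below enumerates all subsets, builds the measure that assigns probability proportional to $x_d = \prod_{A \in d} x_A$ to each set $d$ of size $\eta$, conditions it on containing $u$ and avoiding $v$, and compares the result with the unconditioned measure on $E \setminus (u \cup v)$ of size $\eta - |u|$. The dictionary representation and the function names are hypothetical conveniences for illustration.

```python
from itertools import combinations
from math import prod


def rejective_measure(weights, eta):
    """P_{E,eta}: maps each eta-subset d of the keys of `weights` to x_d / sum_w x_w."""
    mass = {frozenset(d): prod(weights[a] for a in d)
            for d in combinations(sorted(weights), eta)}
    total = sum(mass.values())
    return {d: m / total for d, m in mass.items()}


def conditioned_measure(weights, eta, u, v):
    """P^{u,v}_{E,eta}: P_{E,eta} conditioned on containing u and being disjoint from v."""
    base = rejective_measure(weights, eta)
    kept = {d: m for d, m in base.items() if u <= d and not (d & v)}
    total = sum(kept.values())
    return {d: m / total for d, m in kept.items()}
```

Enumeration is exponential in $|E|$, so this is only a sanity check for populations of a dozen or so individuals.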
In parallel to definition (31), for functions $f_E$ for which versions $f^{u,v}_E$ are defined [such as $f_E = P^{u,v}_{E,\eta}(A)$], for $G \subset \mathbb{N}$ let

r^(G') = {fE : A^/e = |^|(|^|-(l^l+9+{l^^l+9)mod2)/2) all i7 H G = 0}. 
In parallel to Lemma 3.3, we have the following: 

Lemma 4.2. Let p < q be nonnegative integers, and suppose that /f £ 
rP(P) and QF £Tl{Q). Then 

PCQ =^ rp{P)DTl{Q), 
afF^TiiP), f^ + gperp{PuQ), 

pnH = =^ A^/i7Grp+l^l(Pui?), 
/(FW)uGer^(P), 

fFOF^TP+'^iPuQ). 

The proof, being parallel to that of Lemma 3.3, is omitted. 

Lemma 4.3. Let Condition 3.1 hold and r G (0,1/2]. For {E,r]) £ £r, 
G D (u U v) and Gn {A} = 0, 

(56) p^';(A)er°(GuA), 

(57) p^;;(A)P^';(^)GrO(GuA), 

(58) p^;;(^)-P£;,,(^)Gri(GuA). 

Proof. For H C E\{GU A), hy (54) and (53), 

A^p^;;(A) = (-i)i^iAi^ip^'^-^_^(^) 

= (-1)1^1 Al^lPs\(^,uuuv),^-|u|(^)- 

The result (56) now follows by (51) of Corollary 3.1. Since 1 G r°(G U A), 
we have P^'^(^) = 1 - P^'^(A) G rO(G U ^), and, hence, (57) using Lemma 
4.2. 

Next, if B G V / 0, 

ipr.(^) = -aX',;\^(a), 

which is in r^(GU A) by (56). Iterating over all elements in v and using the 
fact that r^(GU A) is closed under addition, we obtain 

(59) p^;;(^) - p^^^'^(^) G (G U ^). 
Next, for $B \in u \cup v$,

= (1 - rE,,{BmZ{A) + Pz,,,(^)P|f (A) 
= p|^(A) -P^,,(^)(P|';(A) -P|;f (^)). 

Rearranging, 

Since P£;,,;(5) e T^^{G U ^) and A^P£;,^(A) G ri(G U A), their product is 
in A^Pe,^(^) G TI{G\JA) by (4.2) and, therefore, ^l'^{A) - Pi?,^(A) G 
r^(G U A). Iterating over aU S G u U v and using the fact that r^(G U A) is 
closed under addition, we have 

p^^"'^(A)-Ps,^(^)Gri(Gu^), 

and now by (59) and the closure property of T\{G\J A) (58) follows. □ 
For short, write

$p^{u,v}_A = P^{u,v}_{E,\eta}(A)$ and $q^{u,v}_A = 1 - p^{u,v}_A$;
as usual, for $(u,v) = (\emptyset,\emptyset)$, we omit the superscripts.

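Lemma 4.4. For any random variable $V$ and $A \notin u \cup v$,

$$E^{u,v}_{E,\eta}((I_A - p_A)V) = \big((p^{u,v}_A - p_A) + p^{u,v}_A q^{u,v}_A \Delta^A\big) E^{u,v}_{E,\eta}(V).$$

Proof. Adding and subtracting $p^{u,v}_A$, we have

$$E^{u,v}_{E,\eta}((I_A - p_A)V) = E^{u,v}_{E,\eta}((I_A - p^{u,v}_A)V) + (p^{u,v}_A - p_A) E^{u,v}_{E,\eta}(V)$$

and

$$E^{u,v}_{E,\eta}((I_A - p^{u,v}_A)V) = p^{u,v}_A E^{u \cup A, v}_{E,\eta}((1 - p^{u,v}_A)V) + (1 - p^{u,v}_A) E^{u, v \cup A}_{E,\eta}(-p^{u,v}_A V)$$

$$= p^{u,v}_A q^{u,v}_A \big(E^{u \cup A, v}_{E,\eta}(V) - E^{u, v \cup A}_{E,\eta}(V)\big)$$

$$= p^{u,v}_A q^{u,v}_A \Delta^A E^{u,v}_{E,\eta}(V). \quad □$$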

Theorem 4.1. Let Condition 3.1 hold and $(E,\eta) \in \mathcal{E}_\tau$ for $\tau \in (0, 1/2]$. If $u, v$ are subsets of $E$ with $G \supset (u \cup v)$ and $V$ is a random variable such that

Er,(^)en(G), 
then for $G \cap H = \emptyset$,

\a£H / 

In particular, when V = 1 and G = u = v = , since 1 £ T^{0) , we have 
Corr{H)=EE,J [] {Ia-Pa)] GTl^k^), 

and, therefore, in particular,

$$\mathrm{Corr}(H) = O_{\tau,|H|}(|E|^{-(|H| + |H| \bmod 2)/2}).$$

Proof. For H = {A}, by Lemma 4.2, 

A^E^;;(T/)Gr^+i(GUyl); 
since G r^(G), using Lemma 4.2 again yields 

Since 

pY-pa^tHguA), 

we also have that 

{pY-PA)E''£,{v)eri+\GuA). 

The result for H = {A} now follows from Lemma 4.4, and then, in general, 
by induction. □ 
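For simple random sampling (all weights equal) the correlations in Theorem 4.1 can be evaluated exactly, which gives a concrete view of the rate $|E|^{-(|H| + |H| \bmod 2)/2}$. The sketch below expands $\prod_{A \in H}(I_A - p)$ and uses the fact that $j$ given individuals are all sampled with probability $\prod_{i<j}(\eta - i)/(n - i)$; the function name and the use of exact rational arithmetic are illustrative choices, not part of the paper.

```python
from fractions import Fraction
from math import comb


def srs_correlation(n, eta, k):
    """Exact E[prod_{A in H}(I_A - p)], |H| = k <= n, for a simple random sample of eta from n."""
    p = Fraction(eta, n)
    total = Fraction(0)
    joint = Fraction(1)      # P(j fixed individuals are all sampled), starting at j = 0
    for j in range(k + 1):
        if j > 0:
            joint *= Fraction(eta - j + 1, n - j + 1)
        # choose which j of the k factors contribute I_A; the other k - j contribute -p
        total += comb(k, j) * (-p) ** (k - j) * joint
    return total
```

For $k = 2$ this returns $-pq/(n-1)$ with $p = \eta/n$ and $q = 1 - p$. For $k = 3$ and $k = 4$, doubling $n$ at fixed $p$ divides the value by roughly four in both cases, in line with the statement that an odd-order correlation decays at the rate of the next even one.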

We close this section with some results which will be useful in Section 5. 

Corollary 4.1. Under the hypotheses for Theorem 4.1, for A,B,G 
distinct, 

PA 

^E,ri{lA - Pa){Ib - Pb) 
EE,-q{lA- PAf{lB - Pb) 
EE,r){lA - Pa){Ib - Pb){Ic - Pc) 



= PA,X + Or{\E\~^), 

= -PA,\qA,\PB,\qB,\VE% + Or{\E\~^), 
= Or{\E\~^), 

= OA\E\~'). 
Proof. The first claim is a consequence of (50). With $F = E \setminus (A \cup B)$,

^E,rj{lA -Pa){Ib -Pb) 

= pAqA^'^^E,r){lB - Pb) (by Lemma 4.4) 

= pAqA^'^^E,-n{B) [since E^'^^(pb) = Pb for ah u, v] 

= -pAqA^^E\A,r,{B) [by (54) of Lemma 4.1] 

= -PAqAPB,\qB,\^F,\2 + Or{\E\~'^) [by (50) of Theorem 3.1] 

= -PA,\qA,\PB,\qB,\^F,\2 + Or(|-E|~^) 
= -PA,xqA,XPB,xqB,XVF% + Ot(|£^|~"^) 
= -PA,\qA,\PB,xqB,XVE% + "t{\E\~^) 
Further, 

^E,r,{lA - PAf{lB - pb) 

= PA{l-pA)''^i'^,{lB -pb) + (1 -Pa)p\K^{Ib-Pb) 
= Pa{1 - PA)i{l - PA)iPB^ -pb) + PAiPB^ - pb)) 

= 0^(|£;|"^) (by Lemma 4.3), 
and the final claim is immediate from Theorem 4.1. □ 



[by (19) and (50)] 
[by (19)] 

(by Lemma 3.1). 



5. Application: asymptotics for conditional and unconditional logistic odds ratio estimators. In this section the theory developed in the previous sections is used to provide an asymptotic theory for the maximum likelihood conditional and unconditional logistic regression odds ratio estimators, $\hat\beta_N$ and $\tilde\beta_N$, under the nested case-control model. Conditions 5.1 and 5.2 ensure the asymptotic stability and nondegeneracy of data in the study base, which is sampled using schemes satisfying Condition 5.3. Lemma 5.1 shows how stability in the study base leads to stability in probability for case-control samples $E$. Theorems 5.1 and 5.2 give the consistency and asymptotic normality of $\hat\beta_N$ and $\tilde\beta_N$. We first consider asymptotically stable covariates in $\mathcal{R}$ and then specialize to the i.i.d. case. Previously, the weights $x_A, A \in E$, were considered fixed, but here even if $x_j, j \in \mathcal{R}$, are fixed, the values $x_A, A \in E$, arrive in $E$ through random failure and control sampling. Suppressing explicit dependence on $\beta_0$ and (as usual) on $\mathcal{R}$ and its size $N$, we indicate the study base model $P_{\lambda_0,\beta_0}$ given in (1) by $P$, and continue to denote the conditional distributions given $E, \eta$ by $P_{E,\eta}$.

The first two conditions are on the stability of the study base data.
Condition 5.1. For all $\delta \in (0, 1)$ there exists $C$ such that for all $N \ge 1$,

(60) $\frac{1}{N}\sum_{j \in \mathcal{R}} 1(|z_j| \le C) \ge 1 - \delta$



and with pj given by (1) with x 

1 

N 



Pj ^ p as N ^ oo. 



Clearly, we then have $\eta/N \to_p p$ as $N \to \infty$ by independence of the failure indicators. Furthermore, $p \in (0, 1)$, since with $C$ corresponding to any $\delta \in (0,1)$ in (60),

^T.PJ> f, ^^JM^)) I E 1(1^.- 1 ^C)>(^ inf p(z)) (1 - 5), 



which by Condition 1.1 is strictly positive for all $N \ge 1$; likewise for $q_j$, where $p_j + q_j = 1$. For $u_j, j \in \mathcal{R}$, let

(61) UjSf = — E % U= sup UN- 

We say $u_j$ is asymptotically stable in mean if $\bar u_N \to u$ for some $u$ as $N \to \infty$, asymptotically dominated in mean if $\overline{|u|} < \infty$, and $u_j(\beta)$ uniformly asymptotically dominated in mean if there exists a neighborhood $B_0 \subset B$ containing $\beta_0$, and $v_j$ asymptotically dominated in mean, such that $|u_j(\beta)| \le v_j$ for all $\beta \in B_0$. For a continuous function $w : [0, \infty) \to \mathbb{R}$ with $\lim_{x \to \infty} w(x) = L \in (-\infty, \infty)$, we say $u_j$ is $w$-stable if $|u_j|^2$ is asymptotically dominated in mean and for all $\lambda \in [0,\infty]$, $u_j w(\lambda x_j) p_j$ and $u_j w(\lambda x_j) q_j$ are asymptotically stable in mean. In what follows, we omit the specification "in mean."

Condition 5.2. 1 is x/{l + x) stable, z?'^ is x/(l + x)^ stable for k = 
0,1,2, |zj|Mz^-p and |z^'p are uniformly asymptotically dominated, and 

for yj = (l,zj). 

The next condition is on the sampling design.

Condition 5.3. For some $f \in (0, 1)$,

(63) $\eta/|E| \to_p f$ as $N \to \infty$,

for $B_j = 1(j \in E)$, uniformly over $j \in \mathcal{R}$,

(64) $E(B_j \mid j \notin D) \to p_f = \frac{p}{1-p} \cdot \frac{1-f}{f}$,

and uniformly over all $j \ne k$ in $\mathcal{R}$,

$\mathrm{Cov}((B_j, B_k) \mid j \notin D, k \notin D) \to 0$ as $N \to \infty$.

For $f$ as in Condition 5.3, set $\tau = (1/2)\min(f, 1 - f)$ for application of Corollary 4.1. The connection between properties of $u_j$ on $\mathcal{R}$ and their corresponding in probability versions on $E$ is made explicit by Lemma 5.1. We say $g_E(\lambda)$ converges uniformly in probability to $g(\lambda)$ if $\sup_{\lambda \in [0,\infty]} |g_E(\lambda) - g(\lambda)| \to_p 0$ as $N \to \infty$.

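Lemma 5.1. Assume Conditions 1.1, 5.1 and 5.3 hold.

(a) For all $\delta \in (0, 1)$, there exists $\varepsilon \in (0, 1)$ such that for all $N \ge 1$,

(65) $P\left(\frac{1}{|E|}\sum_{A \in E} 1(x_A \in [\varepsilon, \varepsilon^{-1}]) \ge 1 - \delta\right) \ge 1 - \delta.$

If $u_j$ is asymptotically dominated, then for all $\delta \in (0,1)$, there exists $K$ such that

(66) $P\left(\frac{1}{|E|}\sum_{A \in E} |u_A| \le K\right) \ge 1 - \delta$ for all $N \ge 1$.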



(b) If $|u_j|^2$ is asymptotically dominated, then $\mathrm{Var}(N^{-1}\sum_{A \in E} u_A) \to 0$.
If, in addition, $u_j p_j$ and $u_j q_j$ are asymptotically stable, then



(67) 



1^1 P 1 



1-/ 



1 + Pj'^Xqx 



l-p \_ 1 + AqX 

with pf given in (64). 

(c) If Uj is w- stable, then 

(68) g]^{X) = -^Y.UAw{XxA) 



as N ^ oo, 



1^1 AGE 



converges in probability uniformly to a continuous limit g^{X) as N ^ oo, 
having form (67) with u replaced by uuj{Xx). Hence, additionally, under 
Condition 5.2, 

hE{X) = T^r E PA^ ek,EW = ^T^T E ^TPA,xqA,\ 

1^1 A&E J 1^1 A&E 
converge uniformly in probability to continuous functions $h(\lambda)$ and $e_k(\lambda)$ for $k = 0, 1, 2$ with form (67).

(d) The limit function $h(\lambda)$ in part (c) strictly increases from 0 to 1 as $\lambda$ increases from 0 to $\infty$. For $f \in (0, 1)$,

$$\lambda_f = p_f^{-1}\lambda_0$$

is the unique solution to $h(\lambda_f) = f$, and



(69) ek = ek{Xf) = [^^^^qxjPx^. 

With hE{X) =rj/\E\, we have 

(e) i/|ujp and \vj\'^ are (uniformly) asymptotically dominated, then 
= t4| ( E ^aVapaqa + E ^aVb{pab -PaPb) ) 

I I \AeE A^B ) 

is (uniformly) bounded in probability. Ifl,Uj,Vj and ujVj are w{x) = x/{l + 
x)'^ -stable, then with the limit of given in (68), 



\AeD AeD / 



^5^^''(A/)-p/-V(A/)®<7^(A;)/eo(A;). 

(f) //juj-p is (uniformly) asymptotically dominated, then (uniformly) 

}^Y.UAilA-PA)^0. 

Age 

(g) // nonnegative weights wj are asymptotically dominated, for all 5 G 
(0, 1), there exists e > such that 



inf ^ y l{wj > e) > 1 



is asymptotically dominated and 

(70) liminf inf a'^FAra > where Tat = — V (n,- - un)^'^, 
N^oo |a|=l jen 
then this same lower bound holds for



'^N,w = (uj - UN,w)^'^ ^ , where UN,w = Yl 



Wj 

Uj: 



In particular, under Condition 5.2 with ek,k = 0,1,2 given in (69), 

(71) S = 62 — CQ^ef^ is positive definite. 

Proof. By considering coordinates, we will assume when convenient that $u_j \in \mathbb{R}$. To show (65), note that

By Conditions 1.1 and 5.1, $N^{-1}\sum_{j \in \mathcal{R}} 1(x_j \notin [\varepsilon, \varepsilon^{-1}])$ can be made arbitrarily small for all $N \ge 1$ by choosing $\varepsilon \in (0, 1)$ sufficiently small. Now by (63) of Condition 5.3 and $\eta/N \to_p p$, we have $|E|/N \to_p p/f$, and (65) follows. Claim (66) follows from

AdE jelZ 

Chebyshev's inequality, and $|E|/N \to_p p/f$.

For (b) note that $E$ is comprised of the set of failures $D$ from $\mathcal{R}$ and a sample from the complement $\mathcal{R} \setminus D$. Hence,

(72) IT.Ua = ^Y ^^^(j ^ D) + 1 E n,l(i eE\T>). 

Age j&n jeiz 

Apply Var(X + y) < 2(Var(X) + Var(y)) on the right-hand side of (72). For 

the first term, by independence, -^"^ Var(^jg^iijl(j G D)) < u"^. 

The indicators in the second term of (72) may not be independent. Write 
its variance as the sum of the diagonal term 

Y u]qjHB,\j i D)(l - q,nBj\j i D)) < TV^^^^ 0, 
and the covariance term, with cn = maxjjt | CoY{Bj,Bk\j ^'D,k ^ D)| , 

<C7v(^)^^0; 



1 

iV2 



Y'^j^kqjQk Cov{Bj,Bk\j ^'D,k^'D] 

hence, Var(iV-i Eage Ua) ^ 0. 
From (72), 



(73) E 



(^Yua)=^Y + ^ E mMB,\j i D). 
Using (64) of Condition 5.3 and the fact that $u_j$ is dominated, the limit of the difference between the expectation (73) and $\overline{up}_N + \overline{uq}_N\, p_f$ is zero. The first equality in (67) now follows from the first part, the stability conditions on $u_j$, that $N/|E| \to_p f/p$ and the definition of $p_f$. The second equality now follows from (1), which gives the identity

p q 1 —p N J -J 

Turning to (c), since $w$ is continuous with finite limit at infinity, and since the stability conditions hold in $[0, \infty]$, without loss of generality, through the mapping $\lambda \to \lambda/(1 + \lambda)$ say, it suffices to consider $\lambda \in [0, 1]$. Let $u(j, \lambda)$ stand for either $u_j w(\lambda x_j) p_j$ or $u_j w(\lambda x_j) q_j$. Since $\|w\| = \sup_{x \in [0,\infty)} |w(x)| < \infty$, we have $|u(j, \lambda)|^2 \le \|w\|^2 |u_j|^2$ and part (b) now shows that for all $\lambda$, $g_E(\lambda)$ converges in probability to $g(\lambda)$ having the form claimed. It remains to show that the limit is continuous and that the convergence is uniform.

Let 5 € (0, 1) be given. Since 

there is M > 1 such that for all E, 

(74) P{\Ul\ <K)>1- 5/6, where K = Mf'^/p. 
Assume for nontriviality that \\w\\ and u"^ are positive. Setting for short 

1^(6) = HXA i [e,e"^]), 

and using notation as in (61), by part (a), there exists e G (0, 1) such that 
for all E, 

I"(Ti?(e) < S'^ lil^^wfK)) > 1 - 6/6. 
Writing gE{\) for 5^ (A), let 

(75) 9£;(A) = 5r(A)+5r(A), 
where 

9%^W = T^ E UAwiXxA) 



1"^! A: a;^e[e,e-i] 



and 




Now applying the Cauchy-Schwarz inequality, with probability at least $1 - \delta/3$,

sup \g<>{X,)-g<>{X,)\ 
0<A2,Ai<l 

<2 sup \g^''iX)\ 

0<A<1 

<2|kll^ E \UA\lA{e)<2\\w\\{UllEie)f'<^-. 

Since w is uniformly continuous on [0, 1] , there exists r > such that 
if \y-x\<Tle \h<iu\w{y) -w{x)\<bl{2K^I'^). 
In particular, 

when Xj < if IA2 — Ai|<r then |u;(A2Xj) — i(;(AiXj)| < 5/(2Er"^/^). 
Hence, by (74), with probability at least 1 — (^/6, 

(A2) - <?r (Ai)| < E \VA\\n^{\2XA) - w{\^xa)\ 

Now by (75), for every 5 there is a r such that for all 

pf sup \gE{\2)-QE{\x)\<b\>\-bl2, 

V|A2-Ai|<T / 

and taking limits, $\sup_{|\lambda_2 - \lambda_1| \le \tau} |g(\lambda_2) - g(\lambda_1)| \le \delta$; hence, $g(\lambda)$ is continuous. Letting $F_1, \ldots, F_M$ be a finite subcover of $[0, 1]$ taken from the open cover of all open sub-intervals of length $2\tau$ and setting $\lambda_j$ to be the center of the interval $F_j$, there exists $N_0$ such that for $|E| \ge N_0$,

n\9E{Xj) - g{X,)\ <5)>1- 6/2, j = l,...,M. 

Now (c) is finished, since for any A, there exists Xj with |A — Aj| < r, and 

\9EiX) - giX)\ < \gE{X) - 9£(A,-)| + |<7i?(A,) - giXj)\ + \g{X,) - giX)\, 

and so for all 5 there exists Nq such that 

for ah 1^1 > iVo sup \gE{X) - g{X)\ <36^>l-5. 

VAe[o,i] / 

To show (d), as in part (a), for given $\delta \in (0, 1)$, there exists $\varepsilon \in (0, 1)$ such that for all $N \ge 1$,

Let $0 < \lambda_1 < \lambda_2 < \infty$ and set

which is strictly positive. Since $p_{j,\lambda}$ is nondecreasing in $\lambda$, for all $N \ge 1$,

PX^N - P>^N ^ ^ bi,A2 - PjM)Pj > 7(1 - 

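and, hence, $\overline{p_{\lambda_2} p} > \overline{p_{\lambda_1} p}$; similarly, $\overline{p_{\lambda_2} q} > \overline{p_{\lambda_1} q}$. As the form of the limit function $h$ is given by (67) with $u_j = p_{j,\lambda}$, $h$ strictly increases from 0 to 1 as $\lambda$ increases from 0 to $\infty$. By continuity, for every $f \in (0, 1)$ there exists a unique $\lambda_f$ such that $h(\lambda_f) = f$.

Next, note that setting $\lambda_f = p_f^{-1}\lambda_0$, we have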

1 + Aox, )=Pf ^^'^^ 

which by (67) gives 

hE{Xf)^^—j^pj'^p = f 

and the claimed representation of ek{Xf). 
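Last, since $h_E(\lambda(E,\eta)) = \eta/|E|$, with $\lambda = \lambda(E,\eta)$,

$$h(\lambda) - h(\lambda_f) = h(\lambda(E,\eta)) - h_E(\lambda(E,\eta)) - \left(f - \frac{\eta}{|E|}\right) \to_p 0,$$

we have

$\lambda \to_p \lambda_f$ as $|E| \to \infty$,

since $h(\lambda)$ is continuous and strictly increasing.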
For (e), by Corollary 4.1, the correspondence 

^£,A = ^yeo£;(A), 

and the (uniform) domination assumed on Uj , Vj , we have that 

1 1 
— UaVapaqa + tt^t X! UaVb{pab -PaPb) 

1^1 A^E 1^1 A^B 

is in probability (uniformly) within Ot-(1) of 

1 1 p 

T^\YUAVAPA,\qA,X- Y^-; Y UAVBPA,xqA,XPB,\qB,\eo^EW^ 
I I A&E I I J A,B&E,A^B 
which by the Cauchy-Schwarz inequality and (66) is (uniformly) bounded in probability. Adding in the diagonal term in the double sum, we see that the quantity above is (uniformly) within $o_\tau(1)$ of

5r(A)-p/-^5^(A)5^(A)e-^(A). 

Part (e) now follows from (c) and (d). 

For part (f), note the given expression has conditional mean zero given $(E,\eta)$, and apply part (e) with $v_j = u_j$.

For (g), let for e G (0, 1), Fat = F^J + F^j:, where 



and similarly define 



1 



uii>e Wi<e 



Applying Holder's inequality, 

2/3 / , \ 1/3 



jen / \ jen 

since \uj\^ is asymptotically dominated, F^ can be made arbitrarily small 
by choice of e. Hence, letting 7 be the value of the liminf in (70), there 
exists e > such that 

(76) liminf inf a"^F^a> 27/3 and \ul^\'^<-f/3 for ah iV. 

N^oo |a|=l 

With > the standard partial ordering on positive definite matrices, for any 
S, 

{uj - vf^ > (uj - usf^ for = E ^nd all E 
jeS j&S I I jeS 

so for this £, 

Wj >£ -J Wj >£ 

> E i-j--Jr > - (u'^'r) > o 

Wj>e 

by (76). Since the weights wj = qj,\fPj,\o satisfy the given conditions, F > 
by (62) of Condition 5.2. □ 
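Part (d) of Lemma 5.1 can be illustrated on a finite study base. The sketch below rests on several assumptions that are our reading of the text rather than statements quoted from it: that model (1) gives $p_j = \lambda_0 x_j/(1 + \lambda_0 x_j)$, that the limiting control sampling probability in (64) is $p_f = \frac{p}{1-p}\cdot\frac{1-f}{f}$, that $\lambda_f = \lambda_0/p_f$, and that the limit $h$ is the average of $p_{j,\lambda}$ weighted by the probability $p_j + q_j p_f$ that individual $j$ enters $E$. Function names are hypothetical.

```python
from fractions import Fraction


def control_sampling_limit(p, f):
    """Assumed form of p_f: (p / (1 - p)) * ((1 - f) / f)."""
    return p / (1 - p) * (1 - f) / f


def case_fraction_limit(x, lam0, f):
    """h(lambda_f) for a finite study base with weights x, taking lambda_f = lam0 / p_f."""
    pj = [lam0 * xj / (1 + lam0 * xj) for xj in x]          # disease probabilities
    p = sum(pj) / len(pj)
    pf = control_sampling_limit(p, f)
    lam_f = lam0 / pf
    incl = [a + (1 - a) * pf for a in pj]                   # P(j in E): a case, or a sampled control
    fitted = [lam_f * xj / (1 + lam_f * xj) for xj in x]    # p_{j, lambda_f}
    return sum(i * g for i, g in zip(incl, fitted)) / sum(incl)
```

Under these assumptions $p_{j,\lambda_f} = p_j/(p_j + q_j p_f)$, so the weighted average collapses to $p/(p + (1-p)p_f) = f$ exactly, which is why exact rational inputs return $f$ with no rounding error.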
Theorem 5.1. Consider a study base $\mathcal{R}$ of $N$ individuals with disease probability given by the proportional odds model (1) and a case-control sampling design giving rise to the likelihood (4). If Conditions 1.1 and 5.1-5.3 are satisfied, there exists a consistent and asymptotically normal sequence
(^N ^/ ™ots of the likelihood equation Cn{(3) = 0; with S as in (71), 

Pm'^Pq and ViV(^^-/3o)^AA(0,S-i). 

Proof. We follow Theorem VI.1.1 in Andersen, Borgan, Gill and Keiding (1993) from Billingsley (1961). For consistency it suffices to show that as $N \to \infty$,



(77) iV-^Z^(/3o)AO, iV-^J(/3o)As, 

and, with i?(/3) = d2{(3)/d(3, that there is a finite constant K such that for 
some neighborhood Bq cB of (3q, 

(78) lim F{\N~^R{(3)\ < K for all /3 G 5o) = 1. 

The first claim in (77) and that $N^{-1}$ times (7) tends to zero in probability follow from Lemma 5.1, part (f), Condition 5.2 and the fact that $N/|E| \to_p f/p$. The second claim in (77) now follows from (6) and Lemma 5.1, part (e).

Turning to (78), write, for example, for X^Aed^A, so by (5), 
(79) 

+ Covs,^(Zi„ZD) +Es,^Z^(/3)^l 

Divided by $N$, the term inside the first parentheses tends to zero uniformly in probability over $B_0$ by Lemma 5.1, part (f) and Condition 5.2. The covariances are uniformly bounded in probability upon division by $N$ by Lemma 5.1, part (e).

Last, the final term in (79) over $|E|$ expands to terms of three types. For the diagonal,



< 



Age 



for the double sums of the following form apply Corollary 4.1 to see that 
1 

W\ 




PAf{lB-PB, 



B\ 
and for the triple sums, by Corollary 4.1,

^ ZA®^B®^C^E,ni.lA-PA){lB-PB){Ic-Pc) 



\E\ 

I I \{A,B,C}\=i 
< Or(l) 



B,C}\=3 V \ AeE / 



I I \{A,B,C} 

These terms are uniformly bounded over $B_0$ by Condition 5.2, giving the existence of the required $K$ in (78) and completing the proof of consistency.

By the Cramér-Wold device, to prove the asymptotic normality claim, it suffices to show

(80) ^b'^/(/3o) ^ AA(0, b'Sb) for ah nonzero b G W. 

V A'' 



For e > 0, define 



2 f\^\wf fX^ '^^^^ 



G.E = {AeE: 



b'ei,s(A) 



b'Z, 



>eaEVN], 
eo,EW } 

Le,E = -irr^ 2^ b Z^ — — PA,\qA,x 



and 



eE = mi{e:Ls^E<£}- 



Hajek's (1964) CLT, with the variables ua replaced by b'Z^p^ ,\, gives (80) 
P 

if — > as N ^ oo. By Holder's inequality, 

2/3 / \ 1/3 



1^1 AGE 



< 



which tends to zero in probability for all $\varepsilon > 0$ by Conditions 5.2 and 5.1 and Lemma 5.1, part (a). Since $\sigma_E^2$ is of order $O(1)$ in probability, $L_{\varepsilon,E} \to_p 0$.
□

Turning now to the unconditional logistic likelihood, for simplicity we parameterize $\lambda = \exp(\alpha)$, let $\alpha_{E,\eta} = \log(\lambda(E,\eta))$ and recall that $(\tilde\alpha_N, \tilde\beta_N)$ maximizes (8).
Theorem 5.2. Under the conditions of Theorem 5.1
I ON - aE,n\ d ,r^r.r^~^ Ten ^ 0^ 



(81) 
where 







T 



eo ej 
ei 62 



Proof. We proceed as in the proof of Theorem 5.1. Since

$$\frac{\partial \log(1 + \lambda x_A)}{\partial \alpha} = p_{A,\lambda} \quad\text{and}\quad \frac{\partial p_{A,\lambda}}{\partial \alpha} = p_{A,\lambda} q_{A,\lambda},$$

taking first and second partial derivatives of the logarithm of (8) with respect to $(\alpha, \beta)$, the unconditional logistic score and information are given, respectively, by



(82) 
and 



U{X,(3)=Y.{Ia-Pa,x) 

AeE 



1 



(83) T(A,/3)=^ 

AeE L 
By (82) and (46), 

(84) Ui^Po) 



1 zl 

r, ry(g)2 

^A '^A 



PA,\qA,X 







+ 



0"^ 

OY.ilA-pA,x)Z'A 







Za{pa-Pa,\) 

.A€E 



By Corollary 4.1, $p_A - p_{A,\lambda} = O_p(N^{-1})$, so by (a) of Lemma 5.1,

(85) $\sum_{A \in E} Z_A (p_A - p_{A,\lambda}) = O_p(1).$

In view of (77), $N^{-1}\mathcal{U}(\lambda, \beta_0) \to_p 0$. Handling the second term in $\mathcal{I}(\lambda, \beta_0)$ in this same manner and applying (c) of Lemma 5.1 to the first,

$$N^{-1}\mathcal{I}(\lambda, \beta_0) \to_p T.$$

By bounding $p_{j,\lambda} q_{j,\lambda}$ below and following a similar but simpler argument as in (g) of Lemma 5.1, we have that $T > 0$ by (62) of Condition 5.2.

Next we consider the remainder term. Writing $Y^{T} = (1, Z^{T})$ and $\gamma^{T} = (\alpha, \beta^{T})$, taking the derivative of $\mathcal{I}$ with respect to $\gamma$ yields

^(7) = E Yf{pA,xqXx - pXxQa,x) + E (^A + 1a ® Yi)pA,xqA,X 



A&E 



A€E 



+ J2{Ia- PA,x)YX - PA,xqA,xYA ^ Y'a. 
A£E 
of which all terms, once divided by $N$, are uniformly asymptotically dominated by Condition 5.2.

By (84), (85), (80) and Slutsky's lemma,

N~^^'^U{\,Po)^MiO,V) where ^''''"^ 



(86) 



00 
s 



The proof is completed by applying the well-known partitioned matrix 
inverse formula, 



-1 



eico 



e S- 



and observing 



eieo 



1 OT 





eo^O 



□ 



We note from Theorems 5.1 and 5.2 that the conditional and unconditional logistic maximum likelihood estimators of the odds ratio parameter $\beta$ have the same asymptotic distribution since $(T^{-1})_{22} = \Sigma^{-1}$.

The following specialization of Theorems 5.1 and 5.2 is a direct conse- 
quence of the law of large numbers. 



Theorem 5.3. Let 'Zij,j G TZ, be i.i.d. replicates of Z. Then the con- 
clusions of Theorems 5.1 and 5.2 hold when Conditions 1.1, 5.1 and 5.3 
are satisfied, -E|Zj|^ < oo, there exists an integrable random variable which 

positive definite. 



Zi'jl"' and |Zj'| in a neighborhood Bq C B of I3q, and Var(Z) is 



When $Z_j$ in the study base are independent with common distribution $Z_j =_d Z$, where $Z$ has distribution function $G$, the case-control set $(E,\eta)$ consists of $\eta$ and $|E| - \eta$ covariates with distribution functions $G_1, G_0$, respectively, where


0,1, 



Ep*(Z)(l-p(Z))i 

with $p(z)$ as in (1). Then $p = Ep(Z)$, and the asymptotic distribution of $Z_A$ in the case-control study is therefore given by $G_f$, where

1-/^1 + A/ x(z) 



(iG;(z)=/dGi(z) + (l-/)dGo(z) 



dG(z), 



1 - p Vl + Ao x(z) , 
and the functions /i_e'(A) and c^^eW converge uniformly in probability, re- 



spectively, to h{\) =E/[pj-a] and efe(A) = E/[Z®^pj- a^j.a]- 
6. Discussion.

Local central limit theorem expansion for the Poisson Binomial distribution. The distribution of the sum of independent Bernoulli random variables with differing probabilities of success has no simple form. Theorem 2.1 gives an expansion, with rates, to any desired accuracy.

Rejective sampling: inclusion and correlations. The probability that an individual is included in a simple random sample has a simple form. Theorem 3.1, which gives an expansion for the probability of inclusion in a rejective sample, shows how special the equally weighted case of simple random sampling is.

Additionally, the decay rate of the high-order correlations for inclusion in a rejective sample (9) has not been previously studied, even for simple random sampling. Theorem 4.1 shows that (with $|H| = k$) the $k$th order correlations decay at the rate $|E|^{-(k + k \bmod 2)/2}$; that is, the odd correlations decay at the same rate as the next even one. In the case of simple random sampling, we have conjectured in Section 1.2 the values of the limiting constants.

Sampling designs. Table 1 is a list of control sampling methods most commonly used in unmatched case-control studies. The designs are classified as "case-control" type when sampling is done directly from the controls in the study base, and as "case-base" type when the sampling is from the study base without regard to case-control status. Each can be sub-classified according to whether the sampling is by simple random sampling without replacement or by independent Bernoulli trials, and whether the number of subjects to be sampled is determined by the "observed" $|D|$ or "expected" $Np$ number of cases. Each design satisfies Condition 5.3. The fourth column in Table 1 provides the parameters for the chosen sampling design that yield asymptotic case-proportion $f$. Thus, under the stated conditions on the covariates in the study base, Theorems 5.1 and 5.2 apply for each design.

Table 1

Examples of sampling methods that satisfy Condition 5.3 with the parameters to yield case-proportion $f$ in the case-control set

Design    Sampling    Observed/      Sampling method to yield $f$
type (a)  method (b)  expected (c)
C/C       SRS         Obs            Exactly $|D|(1-f)/f$ controls
C/C       SRS         Exp            Exactly $Np(1-f)/f$ controls
C/C       BT          Obs            Sample controls with prob $\frac{1-f}{f}\,|D|/(N-|D|)$
C/C       BT          Exp            Sample controls with prob $\frac{1-f}{f}\,p/(1-p)$
CB        SRS         Obs            Exactly $N\frac{1-f}{f}\,|D|/(N-|D|)$ from study base
CB        SRS         Exp            Exactly $N\frac{1-f}{f}\,p/(1-p)$ from study base
CB        BT          Obs            Sample with prob $\frac{1-f}{f}\,|D|/(N-|D|)$ from study base
CB        BT          Exp            Sample with prob $\frac{1-f}{f}\,p/(1-p)$ from study base

(a) C/C: case-control, CB: case-base.
(b) SRS: simple random sampling, BT: Bernoulli trials.
(c) Observed or expected number of cases.
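The sampling parameters for the "observed" case-control designs can be turned into a small calculation. The sketch below takes the entries $|D|(1-f)/f$ (simple random sampling) and $\frac{1-f}{f}\,|D|/(N-|D|)$ (Bernoulli trials) as given, the latter being our reading of a damaged table cell, and checks that both target the case-proportion $f$. Function names are hypothetical and exact rational arithmetic is used only to keep the check free of rounding.

```python
from fractions import Fraction


def srs_controls(num_cases, f):
    """Controls to draw for the C/C, SRS, Obs design: |D|(1 - f)/f."""
    return num_cases * (1 - f) / f


def bernoulli_control_probability(num_cases, n, f):
    """Per-control sampling probability for the C/C, BT, Obs design (assumed form)."""
    return (1 - f) / f * Fraction(num_cases, n - num_cases)


def case_proportion(num_cases, num_controls):
    """Fraction of cases in a case-control set with the given composition."""
    return Fraction(num_cases) / (num_cases + num_controls)
```

For example, with 50 cases in a study base of 1000 and target $f = 1/4$, simple random sampling draws 150 controls, and Bernoulli sampling uses probability $3/19$ on each of the 950 non-cases, which gives 150 controls in expectation.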

Conditional and unconditional logistic regression. Theorems 5.1 and 5.2 
provide the asymptotics of the conditional and unconditional logistic likeli- 
hood estimators of the odds ratio parameter under very broad conditions. 
The asymptotics for the conditional estimator for this wide variety of sam- 
pling schemes are new; see Table 1. Those for the unconditional estimator 
extend its validity to a much wider range of applications. 

Under Conditions 1.1 and 5.1-5.3, these two estimators have the same 
asymptotic distribution. Thus, from a statistical efficiency standpoint, ei- 
ther may be used. Generally, permutation likelihoods are computationally 
quite intensive, with complexity increasing exponentially with sample size 
[Liang and Qin (2000)]. However, exploiting the simplifications possible with 
a dichotomous outcome, a recursive algorithm for the conditional logistic 
likelihood reduces the order of computation to linear in $\eta$ [Cox (1972) and
Gail, Lubin and Rubinstein (1981)], the same order as for the unconditional 
logistic likelihood. This algorithm has been implemented in a number of com- 
puter software packages. Since the unconditional estimator is biased when 
the number of cases is small [Breslow and Day (1980)], the conditional esti- 
mator may be preferred in situations where the case-control study consists 
of multiple case-control sets, some with small numbers of cases. 
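To make the computational point concrete, the following sketch evaluates a conditional logistic likelihood of the form $x_D / \sum_{|w| = \eta} x_w$ using the standard one-pass recursion for elementary symmetric polynomials, and compares it with direct enumeration. It is not claimed to be the algorithm of the cited references as published; a scalar covariate with $x_A = \exp(\beta z_A)$ is an assumption made for illustration, and all function names are hypothetical. The recursion costs on the order of $|E|\,\eta$ operations, whereas enumeration visits every subset of size $\eta$.

```python
from itertools import combinations
from math import exp, prod


def elementary_symmetric(x, eta):
    """Return [e_0, ..., e_eta] of the weights x using the one-pass recursion."""
    e = [1.0] + [0.0] * eta
    for xa in x:
        for k in range(eta, 0, -1):     # descending k so each weight enters a product once
            e[k] += xa * e[k - 1]
    return e


def brute_force_denominator(x, eta):
    """Sum of x_w over all subsets w of size eta, by enumeration (small inputs only)."""
    return sum(prod(x[a] for a in w) for w in combinations(range(len(x)), eta))


def conditional_likelihood(beta, z, cases):
    """x_D / e_eta(x) with x_A = exp(beta * z_A) and D = cases, a list of indices."""
    x = [exp(beta * za) for za in z]
    eta = len(cases)
    return prod(x[a] for a in cases) / elementary_symmetric(x, eta)[eta]
```

At $\beta = 0$ all weights equal one and the likelihood reduces to one over the number of subsets of size $\eta$, which gives a quick check of an implementation.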

Comparison to the analysis of individually matched case-control studies. In earlier work, we studied the asymptotic behavior of conditional logistic (partial likelihood) estimators of the rate ratio from individually matched (nested) case-control data [Goldstein and Langholz (1992) and Borgan, Goldstein and Langholz (1995)].

In the individually matched case-control setting, the within case-control 
set variability is constant with sample size and the asymptotics are driven by 
the increasing number of case-control sets. The situation for the unmatched 
case-control setting that we studied here is very different. There is a single 
(or a fixed number, see Extensions below) case-control set, and the number of 
cases in the set increases with sample size. Consequently, a very different set 
of analytic techniques is required for individually matched and unmatched 
case-control study designs. 
Comparison to the retrospective model. It is of interest to compare our development of the asymptotic theory of the unconditional logistic estimator to that developed under the retrospective model by Prentice and Pyke (1979). In contrast to our results, which only require asymptotic stability of the covariates, the asymptotic theory developed under the retrospective model assumes that the $Z_A$ are random variables with realizations that are i.i.d. conditional on the failure indicator $I_A$. As the antihypertensive drug-MI study example in Section 1 illustrates, the identical distribution assumption may not hold in practice.

Furthermore, we note that the retrospective model is actually semiparametric, the unknown parameters being $(G_0, \beta)$, the control covariate distribution and the odds ratio parameter. Hence, efficiency questions regarding this model must be addressed by considering $G_0$ as an infinite-dimensional nuisance parameter. On the other hand, the nested case-control model considered here is parametric, leaving such questions amenable to simpler analysis.

Interestingly, the derivation of the asymptotic theory in Prentice and Pyke
(1979) is quite different from the one given here. In particular, up to the
scaling factor of $f/p$ which appears here, the asymptotic information $T$
is the same for both models, but the asymptotic variance of the score under
the retrospective model is
$T - (f^{-1} + (1-f)^{-1})[e_0\ e_1^{\mathsf T}]^{\mathsf T}[e_0\ e_1^{\mathsf T}]$,
compared to $V$ in (86). In spite of this difference, the asymptotic
distribution of the estimator of $\beta$ obtained using the unconditional
logistic likelihood is the same under both models. The same is almost true for
$\alpha$, except that $e_0^{-1}$ in the nested case-control model variance (81)
is replaced by $f^{-1} + (1-f)^{-1}$ in the retrospective model variance
[Prentice and Pyke (1979), page 408], the difference being explained by the
choice of centering values, which here is $\alpha_{E,\eta}$ and in Prentice and
Pyke (1979) is $\delta$. Noting that
$(f^{-1} + (1-f)^{-1})^{-1} = f(1-f) = E_f(p_{A,\lambda_f})E_f(q_{A,\lambda_f})$
and that $e_0 f/p = E_f(p_{A,\lambda_f} q_{A,\lambda_f})$, it can be shown that
the nested case-control variance associated with $\alpha$ is smaller than its
retrospective model counterpart due to the extra conditioning here on the $Z_A$.
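
The algebra behind the identity quoted above is short, and the following sketch spells it out together with the elementary covariance inequality that relates the two expectations. Here $p_A$ abbreviates $p_{A,\lambda_f}$, and $q_A = 1 - p_A$ is assumed (the definition of $q$ lies outside this section). This is only an illustration of the ingredients, not a proof of the variance comparison, which also involves the scaling factor $f/p$.

```latex
\begin{align*}
\frac{1}{f} + \frac{1}{1-f}
  &= \frac{(1-f) + f}{f(1-f)} = \frac{1}{f(1-f)},
  \qquad\text{so}\qquad
  \bigl(f^{-1} + (1-f)^{-1}\bigr)^{-1} = f(1-f), \\
E_f(p_A q_A)
  &= E_f(p_A) - E_f(p_A^2)
   = E_f(p_A)\,E_f(q_A) - \mathrm{Var}_f(p_A)
   \le E_f(p_A)\,E_f(q_A).
\end{align*}
```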

Efficiency. The maximum unconditional logistic likelihood estimator has
been shown to be efficient under the retrospective model [Breslow, Robins and
Wellner (2000)]. These authors assume that $(I_A, Z_A)$ are i.i.d., a somewhat
more restricted setting than that considered by Prentice and Pyke (1979). An
open question is under what conditions the conditional and unconditional
estimators of $\beta$ are efficient for all designs that satisfy Condition 5.3
under the nested case-control model. It would seem that the number of cases
$\eta$ has no information about $\beta_0$, so that the likelihood
$P_{\lambda,\beta}(D \mid E, \eta)$, which conditions additionally on $\eta$,
should not result in loss of information relative to the likelihood
$P_{\lambda,\beta}(D \mid E)$. The asymptotic theory for estimators based on
$P_{\lambda,\beta}(D \mid E)$ has not yet been developed (indeed, the results
in this paper are a relevant step toward developing such theory), so a general
comparison is not yet possible. However, we show that the asymptotic variances
of the odds ratio estimators based on $P_{\lambda,\beta}(D \mid E, \eta)$ and
$P_{\lambda,\beta}(D \mid E)$ are equal in the following three important
special cases.

Simple random sampling of controls. This class of designs, where a fixed 
number of controls is sampled from the study base, includes frequency 
matching and sampling a fixed number of controls proportional to the "ex- 
pected" number of cases (i.e., the C/C-SRS entries in Table 1). For these 
designs, the number of cases is a function of the number in the case-control 
set so that $P_{\lambda,\beta}(D \mid E) = P_{\lambda,\beta}(D \mid E, \eta)$.

Full study base. Condition 5.3 clearly holds with $f = p$, so noting that the
full, efficient likelihood $P_{\lambda,\beta}(D \mid \mathcal{R})$ has the form
of an unconditional logistic likelihood, by Theorems 5.1 and 5.2 both the
conditional and unconditional likelihoods are efficient for $\beta$, and in
particular have the same asymptotic variance.

Independent Bernoulli trials sampling of controls with probability $\rho$.
Under this independent control sampling design, Condition 5.3 holds with
$f = p/(p + (1-p)\rho)$ and $P_{\lambda,\beta}(D \mid E)$ has the form of an
unconditional logistic likelihood, and the desired conclusion follows as for
the full study base.
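
As a rough numerical illustration of the limiting case fraction $f$ under Bernoulli sampling of controls, the sketch below compares the closed form with a simulation. It makes simplifying assumptions that are not in the text: every subject has the same disease probability `p`, all cases are retained, and the control sampling probability is read as `rho`. The function names are hypothetical.

```python
import random


def case_fraction(p: float, rho: float) -> float:
    """Closed form f = p / (p + (1 - p) * rho): the limiting fraction of cases
    in the case-control set when all cases are kept and each control is kept
    independently with probability rho (assumed reading of the text)."""
    return p / (p + (1.0 - p) * rho)


def simulate_case_fraction(n: int, p: float, rho: float, seed: int = 0) -> float:
    """Simulate a study base of n subjects with a common disease probability p
    (a simplification) and Bernoulli(rho) sampling of the controls."""
    rng = random.Random(seed)
    cases = 0
    controls = 0
    for _ in range(n):
        if rng.random() < p:        # subject is a case; always retained
            cases += 1
        elif rng.random() < rho:    # subject is a control; retained w.p. rho
            controls += 1
    total = cases + controls
    if total == 0:
        raise ValueError("empty case-control set; increase n, p or rho")
    return cases / total


if __name__ == "__main__":
    p, rho = 0.05, 0.1
    print("closed form:", case_fraction(p, rho))
    print("simulated:  ", simulate_case_fraction(200_000, p, rho))
```

Setting `rho = 1.0` keeps every control, so the set is the full study base and the closed form reduces to $f = p$, in agreement with the full study base case above.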

Extensions. The extension to sampling controls from each of a fixed num- 
ber of large strata is straightforward. Consider a failure probability model 
given by (1), with baseline odds parameters for individuals in stratum s, 
and control selection independent between strata. For each s, let be the 
solution to 



where $E_s$ and $\eta_s$ are the case-control set and the number of cases from
stratum $s$, respectively. Suppose the limiting fractions $\gamma_s$ of subjects
in stratum $s$ exist and are positive, and that Conditions 5.1-5.3 are satisfied
by all strata. Then the conclusions of Theorems 5.1 and 5.2 hold with
$S = \sum_s \gamma_s S_s$, where $S_s$ is the stratum $s$ contribution to the
score of form (71).

Usually disease is rare and efforts are made to enroll all cases into a case- 
control study. The reasons that cases are not enrolled may depend on a 
variety of factors, including the death of the patient or physician refusal. 
If enrollment of each case can be modeled as an independent Bernoulli($p_{\mathrm{case}}$) event, then the 
theory can easily be extended to accommodate such case selection. Specifically,
the probability of sampling the case is absorbed into the baseline by replacing
$\lambda_0$ in (1) by $\lambda_0 p_{\mathrm{case}}$; the theory proceeds without
further change.

We have used the "observed information," $-\partial U(\beta)/\partial\beta$, in
our analysis. In data analysis, it is more common to use the "expected
information," which is the conditional expectation over case-occurrence of the
information $I_{E,\eta}$ and $I_E$ for the conditional and unconditional
likelihoods, given in (6), (7) and (83), respectively [Thomas (1981)]. Because
taking this expectation eliminates the term (7) in $I_{E,\eta}$ and a
corresponding term in $I_E$ that was asymptotically negligible, it is immediate
that the "expected information" is a consistent estimator of the asymptotic
information.

Further work. That $\lambda_f = \lambda_0/\rho_f$ suggests that $\lambda_0$ can
be estimated using the unconditional logistic likelihood when the number of
subjects in the study base (and thus the proportion of cases) is known. This
has been done under (essentially) Bernoulli trials by Weinberg and Wacholder
(1993), and under independent simple random sampling of (cases and) controls by
Scott and Wild (1986) and Breslow and Cain (1988), but further work is needed
to accommodate general Condition 5.3 sampling. In particular, there is
nonnegligible variability in the difference $\lambda - \lambda_f$ that depends
on the sampling design, and which needs to be accounted for in the estimation
of $\lambda_0$.

It is of interest to know when the techniques used here can be generalized 
to accommodate other forms of conditioning on information $S$, as in
likelihood (3). The particular case of no conditioning, $S = \emptyset$,
represents a "full likelihood" under the nested case-control model. The
difficulty is finding an
analog to the independent product measure Tx in Lemma 3.5. 